RNA Galaxy Workbench

The RNA Galaxy workbench is a comprehensive set of analysis tools and consolidated workflows. The workbench is based on the Galaxy framework, which guarantees simple access, easy extension, flexible adaptation to personal and security needs, and sophisticated analyses independent of command-line knowledge. The workbench is described in two manuscripts published in Nucleic Acids Research (see version1, version2).

The current implementation comprises more than 50 bioinformatics tools dedicated to different research areas of RNA biology, including RNA structure analysis, RNA alignment, RNA annotation, RNA-protein interaction, ribosome profiling, RNA-Seq analysis, and RNA target prediction.

The workbench is developed by the RNA Bioinformatics Center (RBC). This center is one of the eight service units of the German Network for Bioinformatics Infrastructure, running the German ELIXIR Node.

Usage

The RNA analysis workbench implements a web server based on the Galaxy Docker platform: a dedicated Galaxy instance wrapped in a Docker container. For advanced local deployments, we recommend checking out the upstream documentation. The workbench can also be used and tested directly as an instance of usegalaxy.eu at rna.usegalaxy.eu.

▲ back to top

Requirement

To use the Galaxy RNA workbench, you only need Docker, which can be installed in different ways, depending on the type of system you're running:

  • Non-Linux users are encouraged to use Kitematic, which provides a Docker installation for OS X or Windows, coupled with a user-friendly interface to run Docker containers;
  • Linux users and people familiar with the command line can follow the instructions for installing Docker on its website.

▲ back to top

Docker configuration

The RNA workbench Docker container is rather large and is expected to grow as further tools and workflows are contributed. For users new to Docker, we list here some tweaks that can help to work around issues when first using Docker. After a successful installation of Docker, it is recommended to configure some settings, for example the storage space required by containers. You can find more information here.
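Because the container is large, it can help to check the available disk space before pulling it. The following is a minimal sketch, not part of the workbench: `free_kb` and `has_space` are hypothetical helper names, and the 20 GB threshold in the example is an arbitrary assumption. It only relies on the standard `df` and `awk` utilities.

```shell
# Print the free space, in kilobytes, of the filesystem that holds the given path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if the filesystem holding the path has at least the
# required number of gigabytes free.
has_space() {
    path="$1"; required_gb="$2"
    [ "$(free_kb "$path")" -ge $((required_gb * 1024 * 1024)) ]
}

# Example (commented out): /var/lib/docker is Docker's default data
# directory on Linux; the 20 GB threshold is an assumption.
# has_space /var/lib/docker 20 || echo "Low disk space for Docker images" >&2
```

If Docker is already installed, `docker system df` additionally reports how much space images, containers and volumes currently occupy.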

▲ back to top

RNA workbench launch

The procedure to launch the RNA workbench depends on whether you run Docker images using Kitematic or the command-line interface.

Using Kitematic

Kitematic users can launch the RNA workbench directly from its interface. The following video shows how to load the docker container that is necessary to use the workbench:

Kitematic galaxy-rna-workbench launch

▲ back to top

Without Kitematic

For non-Kitematic users, starting the RNA workbench is analogous to starting the generic Galaxy Docker image:

$ docker run -d -p 8080:80 quay.io/bgruening/galaxy-rna-workbench

A detailed discussion of Docker's parameters is given in the Docker manual. It is really worth reading. Nevertheless, here is a quick rundown:

  • docker run starts the image as a container

    If the image is not already stored locally, Docker downloads it automatically

  • The argument -p 8080:80 makes port 80 (inside the container) available on port 8080 on your host

    Inside the container, an Apache web server is running on port 80, and that port can be bound to a local port on your host computer. With this parameter you can access your Galaxy instance via http://localhost:8080 shortly after executing the command above

  • quay.io/bgruening/galaxy-rna-workbench is the image name, which directs Docker to the correct path in the Docker registry

  • -d starts the Docker container in detached (daemon) mode, i.e. in the background.

    For an interactive session, one executes:

    $ docker run -i -t -p 8080:80 quay.io/bgruening/galaxy-rna-workbench /bin/bash
    

    and manually invokes the startup script to start PostgreSQL, Apache and Galaxy.
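The port mapping described above can be parameterized. The following is a minimal sketch, not part of the workbench itself: `launch_cmd`, `galaxy_url` and `wait_for_galaxy` are hypothetical helper names, and the default retry count and delay are arbitrary assumptions. It prints the `docker run` command for a chosen host port and polls the resulting URL with `curl` until the web server answers.

```shell
IMAGE="quay.io/bgruening/galaxy-rna-workbench"

# Print the docker run command that maps container port 80
# to the given host port (default 8080).
launch_cmd() {
    host_port="${1:-8080}"
    printf 'docker run -d -p %s:80 %s\n' "$host_port" "$IMAGE"
}

# Print the URL under which Galaxy is reachable for a given host port.
galaxy_url() {
    printf 'http://localhost:%s\n' "${1:-8080}"
}

# Poll the URL until it answers successfully, or give up.
# Arguments: URL, attempts (default 30), delay in seconds (default 10).
wait_for_galaxy() {
    url="$1"; attempts="${2:-30}"; delay="${3:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: silent,
        # -o /dev/null: discard the response body
        if curl -fs -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example (commented out so that sourcing this file starts nothing):
# sh -c "$(launch_cmd 9090)" && wait_for_galaxy "$(galaxy_url 9090)"
```

Printing the command instead of running it directly makes it possible to inspect the exact `docker run` invocation first.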

Docker images are "read-only". All changes during one session are lost after restart. This mode is useful to present Galaxy to your colleagues or to run workshops with it.

To install Tool Shed repositories or to save your data, you need to export the calculated data to the host computer. Fortunately, this is as easy as:

$ docker run -d -p 8080:80 -v /home/user/galaxy_storage/:/export/ quay.io/bgruening/galaxy-rna-workbench

Given the additional -v /home/user/galaxy_storage/:/export/ parameter, Docker will mount the folder /home/user/galaxy_storage into the container under /export/. The startup.sh script, which usually starts Apache, PostgreSQL and Galaxy, will recognize the export directory with one of the following outcomes:

  • In case of an empty /export/ directory, it will move the PostgreSQL database, the Galaxy database directory, Shed Tools and Tool Dependencies and various configuration scripts to /export/ and symlink back to the original location.
  • In case of a non-empty /export/, for example if you continue a previous session within the same folder, nothing will be moved, but the symlinks will be created.
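To see in advance which of the two cases applies to a host folder, it can be inspected before launching the container. This is a minimal sketch under stated assumptions, not the logic of the actual startup.sh script: `export_dir_state` and `describe_export` are hypothetical helper names, and the messages are only illustrative.

```shell
# Report whether a host directory is missing, empty, or already populated.
export_dir_state() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
    elif [ -z "$(ls -A "$dir")" ]; then
        echo "empty"
    else
        echo "non-empty"
    fi
}

# Describe what to expect at startup, following the two cases listed above.
describe_export() {
    case "$(export_dir_state "$1")" in
        missing)   echo "folder does not exist yet" ;;
        empty)     echo "first run: data will be moved to the export folder" ;;
        non-empty) echo "existing session: only symlinks will be created" ;;
    esac
}

# Example (commented out):
# describe_export /home/user/galaxy_storage
```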

This enables you to have different export folders for different sessions - meaning real separation of your different projects.
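As an illustration of keeping projects apart, each project can get its own export folder and host port. The sketch below is an assumption-laden example rather than a recommended layout: `project_cmd` is a hypothetical helper, and the project names, ports and base folder are placeholders.

```shell
IMAGE="quay.io/bgruening/galaxy-rna-workbench"

# Print a docker run command for one project.
# Arguments: project name, host port, base folder for all export folders
# (defaults to $HOME/galaxy_storage).
project_cmd() {
    project="$1"; host_port="$2"; base="${3:-$HOME/galaxy_storage}"
    printf 'docker run -d -p %s:80 -v %s/%s/:/export/ %s\n' \
        "$host_port" "$base" "$project" "$IMAGE"
}

# Example (commented out): two projects, each with its own folder and port.
# project_cmd projectA 8080 /home/user/galaxy_storage
# project_cmd projectB 8081 /home/user/galaxy_storage
```

Using a different host port per project allows two instances to run side by side, while the separate folders keep their databases and installed tools independent.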

Launching the container configures and starts a Galaxy instance and populates it with the needed tools. The instance will be accessible at http://localhost:8080.

For a more specific configuration, you can have a look at the documentation of the Galaxy Docker Image.

▲ back to top

Users and passwords

The Galaxy admin user has the username [email protected] and the password admin. To use certain features of Galaxy, such as the RNA structure visualization, you have to be logged in. The installation of additional tools also requires a login.

The PostgreSQL username is galaxy, the password galaxy and the database name galaxy.

If you want to create new users, please make sure to use the /export/ volume. Otherwise your user will be removed after your docker session is finished.

▲ back to top

Tours

The RNA workbench provides the possibility to run interactive tours that illustrate how the main interface works in relation to real-life user tasks. These show many common operations, such as searching, parametrizing, and running tools, or saving a history of operations in a sharable workflow.

The following video demonstrates the main elements that compose the Galaxy user interface:

Galaxy UI tour

▲ back to top

Available tools

In this section we list all tools that have been integrated in the RNA workbench. The list is likely to grow as further tools and workflows are contributed. To ease readability, we divided them into categories.

▲ back to top

RNA structure prediction and analysis

| Tool | Description | Reference |
| --- | --- | --- |
| antaRNA | Possibility of inverse RNA structure folding and a specification of a GC value constraint | Kleinkauf et al. 2015 |
| CoFold | A thermodynamics-based RNA secondary structure folding algorithm | Proctor et al. 2013 |
| CMCompare | Tool to compare RNA families via covariance models | Eggenhofer et al. 2013 |
| Kinwalker | Algorithm for cotranscriptional folding of RNAs to obtain the min. free energy structure | Geis et al. 2008 |
| MEA | Prediction of maximum expected accuracy RNA secondary structures | Amman et al. 2013 |
| RNAlien | A tool for RNA family model construction | Eggenhofer et al. 2016 |
| RNAshapes | Structures to a tree-like domain of shapes, retaining adjacency and nesting of structural features | Janssen et al. 2014 |
| RNAz | Predicts structurally conserved and therm. stable RNA secondary structures in mult. seq. alignments | Gruber et al. 2010 |
| segmentation-fold | An application that predicts RNA 2D-structure with an extended version of the Zuker algorithm | |
| ViennaRNA | A tool compilation for prediction and comparison of RNA secondary structures | Lorenz et al. 2011 |

▲ back to top

RNA alignment

| Tool | Description | Reference |
| --- | --- | --- |
| CMV | RNA family model visualisation | Eggenhofer et al. 2018 |
| Compalignp | An RNA counterpart of the protein specific "Benchmark Alignment Database" | Wilm et al. 2006 |
| LocARNA | A tool for multiple alignment of RNA molecules | Will et al. 2012 |
| MAFFT | A multiple sequence alignment program for unix-like operating systems | Katoh and Standley 2016 |
| RNAlien | A tool for RNA family model construction | Eggenhofer et al. 2016 |

▲ back to top

RNA annotation

| Tool | Description | Reference |
| --- | --- | --- |
| ARAGORN | A tool to identify tRNA and tmRNA genes | Laslett et al. 2004 |
| FuMa (Fusion Matcher) | A tool to report identical fusion genes based on gene-name annotations | Hoogstrate et al. 2015 |
| GotohScan | A search tool to find shorter sequences in large database sequences | Hertel et al. 2009 |
| Infernal | Suite of tools for building RNA family covariance models (CMs) from structurally annotated sequence alignments | Nawrocki et al. 2013 |
| RNABOB | A tool for fast pattern matching of RNA secondary structures | Gautheret et al. 1990 |
| RNAcode | Predicts protein coding regions in a set of homologous nucleotide sequences | Washietl et al. 2011 |
| tRNAscan | Searches for tRNA genes in genomic sequences | Lowe et al. 1997 |
| RCAS | A generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments | Uyar et al. 2017 |

▲ back to top

RNA-protein interaction

| Tool | Description | Reference |
| --- | --- | --- |
| AREsite2 | A database for AU-/GU-/U-rich elements in human and model organisms | Fallmann et al. 2015 |
| doRiNA | A database of RNA interactions in post-transcriptional regulation | Blin et al. 2014 |
| PARalyzer | An algorithm to generate a map of interacting RNA-binding proteins and their targets | Corcoran et al. 2011 |
| Piranha | A peak-caller for CLIP- and RIP-seq data | |

▲ back to top

RNA-RNA interaction

| Tool | Description | Reference |
| --- | --- | --- |
| IntaRNA | Efficient RNA-RNA interaction prediction incorporating accessibility and seeding of interaction sites | Mann et al. 2017 |

▲ back to top

RNA target prediction

| Tool | Description | Reference |
| --- | --- | --- |
| TargetFinder | A tool to predict small RNA binding sites on target transcripts from a sequence database | Fahlgren et al. 2009 |

▲ back to top

Ribosome profiling

| Tool | Description | Reference |
| --- | --- | --- |
| RiboTaper | An analysis pipeline for Ribo-Seq experiments, exploiting the triplet periodicity of ribosomal footprints to call translated regions | Calviello et al. 2015 |

▲ back to top

RNA-Seq and HTS analysis

Quality control

| Tool | Description | Reference |
| --- | --- | --- |
| FastQC! | A quality control tool for high throughput sequence data | |
| mQC | A quality control tool for ribosome profiling mapping results | Verbruggen and Menschaert 2017 |
| MultiQC | A tool to create reports visualising output from multiple tools across many samples | Ewels et al. 2016 |
| Trim Galore! | A tool for the automation of quality and adapter trimming on paired-end or non-paired-end sequences | |

▲ back to top

RNA-Seq

| Tool | Description | Reference |
| --- | --- | --- |
| Dr. Disco | An analysis pipeline to detect genomic breakpoints in RNA-Seq data | |
| FlaiMapper | A tool for computational annotation of small ncRNA-derived fragments using RNA-seq data | Hoogstrate et al. 2014 |
| NASTIseq | A method that incorporates the inherent variable efficiency of generating perfectly strand-specific libraries | Li et al. 2013 |
| PIPmiR | An algorithm to identify novel plant miRNA genes from a combination of deep sequencing data and genomic features | Breakfield et al. 2011 |
| SortMeRNA | A tool for filtering, mapping and OTU-picking NGS reads in metatranscriptomic and -genomic data | Kopylova et al. 2011 |

▲ back to top

Read mapping

| Tool | Description | Reference |
| --- | --- | --- |
| Bowtie2 | Fast and sensitive read alignment | Langmead et al. 2012 |
| BWA | Burrows-Wheeler Aligner for mapping low-divergent sequences against a large reference genome | Li and Durbin 2010 |
| BWA-MEM | Fast and accurate long-read alignment with Burrows-Wheeler transform | Li et al. 2010 |
| HISAT2 | Hierarchical indexing for spliced alignment of transcripts | Kim et al. 2015 |
| RNA STAR | Rapid spliced aligner for RNA-seq data | Dobin et al. 2013 |
| STAR-fusion | Fast fusion gene finder | Haas et al. 2017 |

▲ back to top

Transcript assembly

| Tool | Description | Reference |
| --- | --- | --- |
| Trinity | De novo transcript sequence reconstruction from RNA-Seq | Haas et al. 2013 |

▲ back to top

Transcript quantification

| Tool | Description | Reference |
| --- | --- | --- |
| featureCounts | Ultrafast and accurate read summarization program | Liao et al. 2014 |
| Sailfish | Rapid alignment-free quantification of isoform abundance | Patro et al. 2014 |
| Salmon | Fast, accurate and bias-aware transcript quantification | Patro et al. 2017 |

▲ back to top

Differential expression analysis

| Tool | Description | Reference |
| --- | --- | --- |
| DESeq2 | Differential gene expression analysis based on the negative binomial distribution | Love et al. 2014 |

▲ back to top

Utilities

| Tool | Description | Reference |
| --- | --- | --- |
| SAMtools | Utilities for manipulating alignments in the SAM format | Li et al. 2009 |
| BEDTools | Utilities for genome arithmetic | Quinlan et al. 2010 |
| deepTools | A suite of tools for exploring high-throughput sequencing (HTS) data, such as ChIP-seq, RNA-seq, and MNase-seq | Ramirez et al. 2016 |

▲ back to top

Training

To learn about RNA sequencing data analysis, we recommend having a look at the training material from the Galaxy Training Network, particularly the tutorial on Reference-based RNA-seq data analysis.

In the Galaxy RNA workbench, we also included Galaxy interactive tours to guide you through Galaxy, its tools and its possibilities.

▲ back to top

Contributors

▲ back to top

How to contribute

The RNA workbench community welcomes new contributions and help of any kind. We have collected detailed instructions and some guidance in our CONTRIBUTING.md.

Support and bug reports

For support, questions, feature requests, or bug reports, please file an issue on our issue page.

▲ back to top

MIT license

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

▲ back to top