Home
- Fill in bullet point 4 on "Request for UMass RStudio server account / install local RStudio"
Section 1:
- The Unix Tutorial Part 2: (Piping &) Shell Scripting
- "Getting Data for Analysis" page
- Analysis of FASTQC report
- Trimmomatic > IGV pre-analysis and post-analysis
Section 2:
- Tophat2 + Cufflinks + Cuffmerge
Section 3:
- Visualization
Welcome to the bioinformatics workshop on RNA-seq data analysis. This page contains the materials and links to materials that you will need for the workshop.
This 2-day workshop aims to equip life scientists with basic lab skills for scientific computing and to provide an introduction to RNA-seq data analysis, followed by tutorials demonstrating the use of popular RNA-seq analysis packages. It will cover the basic concepts and tools needed to work in an HPC environment (namely Linux and LSF commands). Participants will be encouraged to help one another and to apply what they have learned to their own research problems.
- Recognize input and output file formats commonly used in RNA-seq data analysis
- Identify the major analysis steps in a typical RNA-seq data analysis pipeline
- Demonstrate a working example of an RNA-seq data analysis pipeline based on the Tuxedo protocol (Trapnell et al. 2012) using command-line software in a Linux cluster environment (i.e., MGHPCC)
- Understand major analysis components and standard tools used in a typical RNA-seq data analysis workflow
- Construct a workflow to answer specific research questions
- Evaluate and integrate new bioinformatics tools into a workflow
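As a small illustration of the first objective above (recognizing common file formats), here is a minimal Python sketch that reads FASTQ records, the four-line-per-read format that tools such as FastQC take as input. It is not part of the workshop materials: the names `FastqRecord`, `read_fastq`, and `mean_phred` are hypothetical, the two reads are made-up example data, and the quality scores are assumed to use Phred+33 encoding.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one record per four-line FASTQ entry."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (or a blank line)
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string, assuming Phred+33 encoding."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


# Hypothetical two-read example, not workshop data.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nGGCA\n+\n!!II\n"
)
for record in read_fastq(example):
    print(record.read_id, record.sequence, mean_phred(record.quality))
```

In practice a dedicated tool or library would normally handle this; the sketch only shows how the header, sequence, separator, and quality lines fit together.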
Undergraduates, graduates, postgraduates, and PIs working on analysis of RNA-seq data.
- **You must have a UMass MGHPCC account to participate in the workshop.** Follow the instructions here to request an account from the MGHPCC Team and Research Computing / Information Services at UMass Medical School. The request can take 1-5 business days to complete, so be sure to plan ahead.
- Fill in this Registration Form (Link to Google Form) after registering for your MGHPCC account.
- Please bring your own laptop.
- Request a UMass RStudio server account, or install R and RStudio on your local computer.
- Please complete the following simple Unix, Git, and R tutorials before attending:
  a. UNIX Tutorial for Beginners (Tutorials 1-4).
  b. Try the tutorial here to start learning R.
  c. Git is a version control system for tracking changes to files. You may have used SVN, an older version control system. Git differs in some important ways: it is distributed, meaning everyone has a full copy of the repository, so you can work offline and operations are fast; and it is built around a branch-based workflow. Here is a 15-minute interactive tutorial to give you the basics.
Time | Note | |
---|---|---|
9:00 AM | HPC and set up | |
10:30 AM | Unix Shell | |
1:30 PM | Lunch Break | |
2:00 PM | TBA | |
1:00 PM | TBA | |
2:30 PM | TBA | |
4:00 PM | Questions and Feedback | |
Time | Note | |
---|---|---|
9:00 AM | TBA | |
9:30 AM | TBA | |
10:30 AM | TBA | |
12:00 Noon | Lunch Break | |
1:00 PM | TBA | |
2:30 PM | TBA | |
4:00 PM | Questions and Feedback | |
6-iii. Integrated assignment answers
# Table of Contents
- Module 0 Setting Up for Data Analysis
  - Introduction to High Performance Computing Cluster
  - Connecting to MGHPCC
  - Computing Environment
  - Unix Tutorial Part 1: UNIX Bootcamp
  - Unix Tutorial Part 2: Shell Scripting
  - Unix Tutorial Practice
  - Submitting computing jobs to HPC using LSF
  - Ignore: Git Tutorial
- Module 1 Introduction / Overview
  - Overview of RNA-seq Experiment
  - RNA-Seq Analysis Pipeline
  - RNA-Seq Input Data
  - RNA-seq File Formats and Software-Specific Files
  - Getting Data for Analysis
- Module 2 Quality Control
- Module 3 Tuxedo Pipeline
  - The Tuxedo Pipeline
  - Read Alignment with TopHat2
  - Transcript Assembly with Cufflinks
  - Differential Analysis with Cuffdiff
  - Visualization with CummeRbund
- Resources and Reference