As part of the EANBIT Virtual Residential Training 2020, this group will be working on a mini-project to develop a Snakemake workflow for RNA-Seq data processing and gene expression analysis. The detailed documentation is provided here.

Group members:
- Ruth Nanjala (Group Lead)
- Kakembo Fredrick Elishama
- Eric G. Kairuki
- Stella E. Nabirye
- Senamile Fezile Dlamini
- Mthande S. Mzwakhile
- Monica Mbabazi
- Ritah Nabunje

The workflow covers the following steps:
- Pre-processing of the reads
  - Quality check using FastQC and MultiQC
  - Trimming of poor-quality bases and filtering of short reads using Trim Galore
  - Quality check using MultiQC
- Alignment of samples to the reference. Two approaches were used:
  - Classical alignment using HISAT2
    - Count generation using Subread featureCounts
    - Quality check using MultiQC
  - Pseudo-alignment using kallisto
- Differential expression analysis in R using DESeq
- Converting the pipeline to R Markdown and Snakemake
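
The command-line steps above can be sketched as a small dry-run helper that builds, but does not execute, the commands for one paired-end sample. This is only an illustrative sketch and not the group's actual script: the function name `build_commands`, the directory layout, the index and annotation paths, and the trimming thresholds are all assumptions, and a single MultiQC call at the end stands in for the per-stage MultiQC checks. The differential expression step in R is not covered.

```python
import shlex
from pathlib import Path


def build_commands(sample, r1, r2, outdir="results", threads=4,
                   hisat2_index="index/genome",             # assumed HISAT2 index prefix
                   kallisto_index="index/transcripts.idx",  # assumed kallisto index file
                   gtf="annotation.gtf"):                   # assumed annotation file
    """Return ordered (step, argv) pairs for one paired-end sample. Nothing is run."""
    out = Path(outdir)
    t = str(threads)
    trimmed = out / "trimmed"
    # With --basename, Trim Galore names paired outputs
    # <sample>_val_1.fq.gz and <sample>_val_2.fq.gz (needs a recent Trim Galore).
    t1 = str(trimmed / f"{sample}_val_1.fq.gz")
    t2 = str(trimmed / f"{sample}_val_2.fq.gz")
    sam = str(out / "hisat2" / f"{sample}.sam")
    return [
        # Quality check of the raw reads
        ("fastqc_raw", ["fastqc", "-t", t, "-o", str(out / "fastqc_raw"), r1, r2]),
        # Trim poor-quality bases and drop short reads (thresholds are placeholders)
        ("trim_galore", ["trim_galore", "--paired", "--quality", "20",
                         "--length", "20", "--basename", sample,
                         "-o", str(trimmed), r1, r2]),
        # Classical alignment; the summary file can be picked up by MultiQC
        ("hisat2", ["hisat2", "-p", t, "-x", hisat2_index, "-1", t1, "-2", t2,
                    "-S", sam,
                    "--summary-file", str(out / "hisat2" / f"{sample}.summary.txt")]),
        # Count generation; newer Subread releases may also need --countReadPairs
        ("featurecounts", ["featureCounts", "-T", t, "-p", "-a", gtf,
                           "-o", str(out / "counts" / f"{sample}.counts.txt"), sam]),
        # Pseudo-alignment on the same trimmed reads
        ("kallisto", ["kallisto", "quant", "-i", kallisto_index,
                      "-o", str(out / "kallisto" / sample), "-t", t, t1, t2]),
        # Aggregate all reports found under the results directory
        ("multiqc", ["multiqc", str(out), "-o", str(out / "multiqc")]),
    ]


if __name__ == "__main__":
    # Print the plan for a hypothetical sample; output directories would
    # need to exist before the commands are actually run.
    for step, argv in build_commands("sample1", "sample1_R1.fastq.gz",
                                     "sample1_R2.fastq.gz"):
        print(f"# {step}")
        print(shlex.join(argv))
```

Keeping each step as an argument list makes it easy to review the plan first and then either pass each `argv` to `subprocess.run` or copy the printed commands into the `shell:` sections of Snakemake rules.
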
A summary report of the workflow is documented on the Wiki page with the following sections:
- RNA Seq Workflow
- RNA Seq Workflow_Introduction
- RNA Seq Workflow_Bash Script
- RNA Seq Workflow_R Markdown Script
- Link to the output report is here