This pipeline was run in a Unix terminal (Terminal on macOS).
To run QIIME 2, Conda must be installed first.
Conda can be installed by following the instructions on the Conda website.
conda install conda-build
It is important to check that Conda is up to date:
conda update conda
conda update conda-build
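Before going further, it can help to confirm that the required tools are reachable from the current shell. The lines below are only a minimal sketch: `require_tool` is a hypothetical helper name, and it relies on nothing more than the standard `command -v` builtin.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Example: check for conda before starting the pipeline.
require_tool conda || echo "Install Conda before continuing."
```

The same check can be repeated for `qiime` and `biom` once the QIIME 2 environment is activated.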
And now, QIIME 2
The QIIME 2 website has the installation instructions; its documentation also explains in detail how QIIME 2 works.
QIIME 2 can be installed natively in a Conda environment or run in a virtual machine; both options are described in the installation instructions.
- Open the terminal (Linux environment)
Tip: Recent macOS versions set zsh as the default shell. This pipeline was run in bash, so switch the default shell to bash (the change takes effect when the terminal is reopened):
chsh -s /bin/bash
- Create the manifest file
The manifest file is a comma-separated text file with the following header: sample-id,absolute-filepath,direction
Below is an example of a manifest file (header plus one sample row). Note that there must be no blank spaces, neither in the file contents nor in the file names. For paired-end data, each sample needs one row with direction forward and one row with direction reverse.
sample-id,absolute-filepath,direction
SampleS7,/Users/user/Documents/Postdoc/Sequencing/Southport/S7_S13_L001_R2_001.fastq,forward
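Writing the manifest by hand becomes error-prone with many samples, so it can be generated from the FASTQ file names. The following is a sketch under stated assumptions, not the method used above: `make_manifest` is a hypothetical helper, it assumes that forward and reverse files contain `_R1_` and `_R2_` in their names, and it assumes that the sample ID is the part of the file name before the first underscore.

```shell
# Hypothetical helper: print a manifest for every *_R1_*/*_R2_* FASTQ pair in a directory.
# Assumption: the sample ID is the part of the file name before the first underscore.
make_manifest() {
    dir=$(cd "$1" && pwd) || return 1        # resolve to an absolute path
    echo "sample-id,absolute-filepath,direction"
    for fwd in "$dir"/*_R1_*.fastq; do
        [ -e "$fwd" ] || continue            # the pattern matched nothing
        name=$(basename "$fwd")
        sample=${name%%_*}
        rev="$dir/$(echo "$name" | sed 's/_R1_/_R2_/')"
        echo "$sample,$fwd,forward"
        if [ -e "$rev" ]; then
            echo "$sample,$rev,reverse"
        fi
    done
}

# Usage (the directory is a placeholder): make_manifest /path/to/fastq > manifest.txt
```

If the sample IDs should differ from the file-name prefix, the `sample=` line is the only place that needs to change.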
- Activate QIIME2
conda activate qiime2-2020.8
- Import the sequences (using the manifest)
qiime tools import --type 'SampleData[PairedEndSequencesWithQuality]' --input-path /Users/user/Documents/Southport/Sequencing/Parys_Mountain/manifest_PM.txt --output-path paired-end-demuxl.qza --input-format PairedEndFastqManifestPhred33
Notice that the input path will be different for each set of samples
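Because a malformed manifest makes the import fail, it may be worth checking the file first. The sketch below tests the rules stated for the manifest (expected header, three comma-separated fields, no spaces, absolute paths, direction forward or reverse). `check_manifest` is a hypothetical helper written for illustration; it is not part of QIIME 2.

```shell
# Hypothetical helper: check a manifest against the rules described above.
# Prints one message per problem and returns non-zero if any were found.
check_manifest() {
    awk -F',' '
        NR == 1 {
            if ($0 != "sample-id,absolute-filepath,direction") {
                print "line 1: unexpected header"; bad = 1
            }
            next
        }
        / /      { print "line " NR ": contains a space"; bad = 1 }
        NF != 3  { print "line " NR ": expected 3 fields"; bad = 1; next }
        $2 !~ /^\// { print "line " NR ": path is not absolute"; bad = 1 }
        $3 != "forward" && $3 != "reverse" { print "line " NR ": bad direction"; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Usage (the file name is a placeholder): check_manifest manifest_PM.txt && echo "manifest looks fine"
```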
- Visualize sequences
qiime demux summarize --i-data paired-end-demuxl.qza --p-n 2485316 --o-visualization demuxl.qzv
Notes:
This command creates the file demuxl.qzv, which is stored in the User folder (I set this folder as the default).
The n (--p-n) is the subsample size, that is, the number of sequences sampled to build the quality plots; that number was standardised for all the samples. It is different for each set of samples, based on the length of the sequences.
The file can now be dragged onto QIIME 2 View (view.qiime2.org) to visualize it; then click on “Interactive Quality Plot”.
This shows the quality score at each position of the reads. Ideally, we want to keep only sequence with a Phred quality score higher than 33, and that is why the next step trims the edges of the sequences: the first 20 bases are removed from the left end of each read (--p-trim-left-f and --p-trim-left-r) and each read is truncated at position 220 on the right (--p-trunc-len-f and --p-trunc-len-r).
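To make the effect of these two parameters concrete: as documented for DADA2's filtering step, a read truncated at position `trunc-len` and trimmed by `trim-left` bases at the start keeps `trunc-len - trim-left` bases. The sketch below does that arithmetic and also estimates how much the forward and reverse reads overlap, which matters because DADA2 needs an overlap (12 bases by default in its merging step) to join the pairs. Both function names are hypothetical, and the amplicon length of 400 is an invented example value, not a property of these samples.

```shell
# Hypothetical helper: bases kept per read after trimming and truncation.
kept_length() {
    trim_left="$1"
    trunc_len="$2"
    echo $((trunc_len - trim_left))
}

# Hypothetical helper: expected overlap between forward and reverse reads,
# given both truncation lengths and the full amplicon length.
overlap_length() {
    fwd_trunc="$1"
    rev_trunc="$2"
    amplicon_len="$3"
    echo $((fwd_trunc + rev_trunc - amplicon_len))
}

kept_length 20 220          # prints 200
overlap_length 220 220 400  # prints 40 (400 is an assumed amplicon length)
```

If the estimated overlap is small or negative for the real amplicon length, the truncation lengths are probably too short for the pairs to merge.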
- Denoising with DADA2
qiime dada2 denoise-paired --i-demultiplexed-seqs paired-end-demuxl.qza --p-trim-left-f 20 --p-trim-left-r 20 --p-trunc-len-f 220 --p-trunc-len-r 220 --o-representative-sequences rep-seqs-dada2l.qza --o-table table-dada2l.qza --o-denoising-stats stats-dada2l.qza
The following files are created:
Saved FeatureTable[Frequency] to: table-dada2l.qza
Saved FeatureData[Sequence] to: rep-seqs-dada2l.qza
Saved SampleData[DADA2Stats] to: stats-dada2l.qza
Notes
table-dada2l.qza: this file contains the feature table in .biom format (the OTUs and the number of reads for each of them; strictly, DADA2 reports amplicon sequence variants). This file can also be used to determine alpha diversity in R (remember to transpose the data and to subsample).
rep-seqs-dada2l.qza: the list of OTU sequences detected in the samples. These sequences can be used for BLAST searches and will be used to assign taxonomy.
stats-dada2l.qza: statistics of the sequences (number of reads, merged reads and chimeric reads). This information may be useful to report in the methods section.
- Obtain statistics
qiime metadata tabulate --m-input-file stats-dada2l.qza --o-visualization stats-dada2l.qzv
Saved Visualization to: stats-dada2l.qzv
- Assign taxonomy
To assign taxonomy, we need to download a pre-trained classifier and store it in the folder we have set as the default. Classifiers trained on the Silva and Greengenes databases are available from the QIIME 2 documentation site.
Once a classifier is downloaded, we can run one of the following commands (depending on the database used)
qiime feature-classifier classify-sklearn --i-classifier silva-138-99-nb-classifier.qza --i-reads rep-seqs-dada2l.qza --o-classification taxonomyl.qza
or
qiime feature-classifier classify-sklearn --i-classifier gg-13-8-99-nb-classifier.qza --i-reads rep-seqs-dada2l.qza --o-classification taxonomyl.qza
The following file was created:
Saved FeatureData[Taxonomy] to: taxonomyl.qza
- Classification by level
The taxonomy file can be difficult to handle, so we can collapse the feature table at each taxonomic level and create one file per level
Level 1: domain
qiime taxa collapse \
--i-table table-dada2l.qza \
--i-taxonomy taxonomyl.qza \
--p-level 1 \
--o-collapsed-table feature-table-level1l.qza
Level 2: phylum
qiime taxa collapse \
--i-table table-dada2l.qza \
--i-taxonomy taxonomyl.qza \
--p-level 2 \
--o-collapsed-table feature-table-level2l.qza
Level 3: class
qiime taxa collapse \
--i-table table-dada2l.qza \
--i-taxonomy taxonomyl.qza \
--p-level 3 \
--o-collapsed-table feature-table-level3l.qza
Level 4: order
qiime taxa collapse \
--i-table table-dada2l.qza \
--i-taxonomy taxonomyl.qza \
--p-level 4 \
--o-collapsed-table feature-table-level4l.qza
Level 5: family
qiime taxa collapse \
--i-table table-dada2l.qza \
--i-taxonomy taxonomyl.qza \
--p-level 5 \
--o-collapsed-table feature-table-level5l.qza
Level 6: genus
qiime taxa collapse \
--i-table table-dada2l.qza \
--i-taxonomy taxonomyl.qza \
--p-level 6 \
--o-collapsed-table feature-table-level6l.qza
Level 7: species
qiime taxa collapse \
--i-table table-dada2l.qza \
--i-taxonomy taxonomyl.qza \
--p-level 7 \
--o-collapsed-table feature-table-level7l.qza
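The seven commands above differ only in the level number, so they can be written as a loop. This is a sketch rather than the way the pipeline was actually run: `collapse_level` and the `DRY_RUN` switch are hypothetical names introduced here. With `DRY_RUN` left at its default of 1 the commands are only printed, which allows checking them before anything is executed.

```shell
# Sketch: run (or just print) qiime taxa collapse for one taxonomic level.
# DRY_RUN is a hypothetical switch: 1 (default) prints the command, 0 executes it.
collapse_level() {
    level="$1"
    set -- qiime taxa collapse \
        --i-table table-dada2l.qza \
        --i-taxonomy taxonomyl.qza \
        --p-level "$level" \
        --o-collapsed-table "feature-table-level${level}l.qza"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Levels 1 (domain) to 7 (species)
for n in 1 2 3 4 5 6 7; do
    collapse_level "$n"
done
```

Setting `DRY_RUN=0` before the loop runs the commands inside the activated QIIME 2 environment.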
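The .qza files are zip archives and can be opened with a decompressor (I used The Unarchiver for macOS). Inside each extracted folder, the data subfolder contains a .biom file (feature-table.biom); that is the file we need. The commands in the next step assume it has been renamed after its level, for example feature-table_level1l.biom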
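As an alternative to unpacking the archives by hand, `qiime tools export` writes the contents of an artifact to a folder; for a feature table this produces a file called `feature-table.biom`. The sketch below is not the procedure used above. `run`, `export_level`, the `DRY_RUN` switch and the `exported-level…` folder names are hypothetical, and the `mv` step only renames the file to the name expected by the conversion step.

```shell
# DRY_RUN is a hypothetical switch: 1 (default) prints each command, 0 executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Export one collapsed table and give the .biom file the name used in the next step.
export_level() {
    run qiime tools export \
        --input-path "feature-table-level${1}l.qza" \
        --output-path "exported-level${1}l"
    run mv "exported-level${1}l/feature-table.biom" "feature-table_level${1}l.biom"
}

for n in 1 2 3 4 5 6 7; do
    export_level "$n"
done
```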
- Conversion to readable tables
Note: the biom conversion needs to be run inside the activated qiime2 environment
biom convert -i feature-table_level1l.biom -o table.from_biom_level1l.txt --to-tsv
biom convert -i feature-table_level2l.biom -o table.from_biom_level2l.txt --to-tsv
biom convert -i feature-table_level3l.biom -o table.from_biom_level3l.txt --to-tsv
biom convert -i feature-table_level4l.biom -o table.from_biom_level4l.txt --to-tsv
biom convert -i feature-table_level5l.biom -o table.from_biom_level5l.txt --to-tsv
biom convert -i feature-table_level6l.biom -o table.from_biom_level6l.txt --to-tsv
biom convert -i feature-table_level7l.biom -o table.from_biom_level7l.txt --to-tsv
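The conversions above can also be driven by the files that are actually present, which avoids errors when a level is missing. In this sketch, `convert_all` and `DRY_RUN` are hypothetical names; the function assumes the `.biom` files follow the `feature-table_level<N>l.biom` naming used above and derives each output name from the input name.

```shell
# Sketch: convert every feature-table_level*l.biom in a directory to TSV.
# DRY_RUN is a hypothetical switch: 1 (default) prints each command, 0 executes it.
convert_all() {
    for biom_file in "$1"/feature-table_level*l.biom; do
        [ -e "$biom_file" ] || continue          # the pattern matched nothing
        level=${biom_file##*_level}              # e.g. "3l.biom"
        level=${level%l.biom}                    # e.g. "3"
        out="$1/table.from_biom_level${level}l.txt"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "biom convert -i $biom_file -o $out --to-tsv"
        else
            biom convert -i "$biom_file" -o "$out" --to-tsv
        fi
    done
}

# Usage: convert the .biom files in the current folder
convert_all .
```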
- End the process
conda deactivate