An extension to identify viral elements in Kraken 2 outputs
This project was created to identify viral contigs using Kraken 2 in the publication Cite. This extension extracts the headers of viral elements and, if the sequencing file is provided, can either retain only those sequences or remove them.
To get a local copy up and running, follow these simple steps.
VirKraken requires Python 3 and the following libraries (if installing through pip, the libraries are installed automatically):
- pandas
- scikit-learn
- biopython
- importlib-resources
VirKraken is available on PyPI, and this repository can be forked. The easiest way to install VirKraken is to use pip.
pip install virkraken
VirKraken works as a command-line script. Once installed via pip, the virkraken command can be accessed. To get the help screen, type:
virkraken -h
The parameters of VirKraken are:
- -f: Kraken output file [required]
- -c: Sequencing file to parse [optional]
- -r: Remove viral elements flag
- -o: Rename output files [optional]
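As a rough illustration of how the four flags fit together, the sketch below rebuilds the documented interface with `argparse`. It is not VirKraken's source code: the function name `build_parser` and the attribute names (`kraken_file`, `contigs`, `remove`, `out_prefix`) are hypothetical, and only the short flags come from the list above.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the documented flags; not VirKraken's own code.
    parser = argparse.ArgumentParser(prog="virkraken")
    parser.add_argument("-f", dest="kraken_file", required=True,
                        help="Kraken output file [required]")
    parser.add_argument("-c", dest="contigs", default=None,
                        help="Sequencing file to parse [optional]")
    parser.add_argument("-r", dest="remove", action="store_true",
                        help="Remove viral elements flag")
    parser.add_argument("-o", dest="out_prefix", default=None,
                        help="Rename output files [optional]")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["-r", "-f", "Kraken_Output.txt", "-c", "final.contigs.fa", "-o", "Filtered_Sequences"]
)
print(args.kraken_file, args.contigs, args.remove, args.out_prefix)
```

Modelling `-r` as a boolean switch matches its description as a flag: it takes no value and only changes what happens to the sequences named by `-c`.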
virkraken -f Kraken_Output.txt -o Viral_Sequences
The command above returns Viral_Sequences.csv, which contains the sequence headers and their NCBI TaxIDs. All returned sequence headers are viral.
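To make the header and TaxID table concrete, here is a minimal sketch that reads standard Kraken 2 output (tab-separated: classification status, sequence ID, taxonomy ID, length, LCA mapping) and writes header/TaxID rows. It assumes Kraken 2 was run without `--use-names`. How VirKraken itself decides that a TaxID is viral is not described here, so the `viral_taxids` set, the function names, and the sample records are all placeholders for illustration.

```python
import csv
import io


def read_kraken_assignments(handle):
    """Yield (sequence_header, taxid) for each classified Kraken 2 record."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, header, taxid = fields[0], fields[1], fields[2]
        if status == "C":  # "C" = classified, "U" = unclassified
            yield header, taxid


def write_header_table(pairs, viral_taxids, out_handle):
    """Write one 'header,taxid' row per record whose taxid is in viral_taxids."""
    writer = csv.writer(out_handle)
    kept = 0
    for header, taxid in pairs:
        if taxid in viral_taxids:
            writer.writerow([header, taxid])
            kept += 1
    return kept


# Made-up records in Kraken 2 output layout.
sample = (
    "C\tcontig_1\t10245\t1500\t10245:20\n"
    "U\tcontig_2\t0\t900\t0:10\n"
    "C\tcontig_3\t562\t2000\t562:30\n"
)
viral_taxids = {"10245"}  # assumption: supplied by the caller for this example

table = io.StringIO()
n_viral = write_header_table(
    read_kraken_assignments(io.StringIO(sample)), viral_taxids, table
)
print(n_viral, table.getvalue().splitlines())
```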
virkraken -f Kraken_Output.txt -c final.contigs.fa -o Viral_Sequences
The command above returns Viral_Sequences.csv and Viral_Sequences.fasta. All returned sequences are viral. VirKraken filters the input FASTA for sequence headers matching the predicted viral headers.
virkraken -r -f Kraken_Output.txt -c final.contigs.fa -o Filtered_Sequences
Sequence headers that are assigned a viral designation are removed from the resulting FASTA output file.
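The keep and remove modes can be pictured as the same pass over the FASTA file with the test inverted. The sketch below uses only the standard library and is not VirKraken's implementation. The function name `filter_fasta` and the sample records are hypothetical, and it assumes the viral headers match the first whitespace-delimited token of each FASTA header line.

```python
import io


def filter_fasta(in_handle, out_handle, viral_headers, remove=False):
    """Copy FASTA records to out_handle and return how many were written.

    remove=False keeps only records whose ID is in viral_headers;
    remove=True keeps every record except those.
    """
    keep = False
    written = 0
    for line in in_handle:
        if line.startswith(">"):
            # Record ID: text after '>' up to the first whitespace.
            parts = line[1:].split()
            record_id = parts[0] if parts else ""
            is_viral = record_id in viral_headers
            keep = is_viral != remove
            if keep:
                written += 1
        if keep:
            out_handle.write(line)
    return written


# Made-up contigs for illustration.
fasta = ">contig_1 len=8\nACGTACGT\n>contig_2 len=4\nGGCC\n"
viral = {"contig_1"}

viral_only = io.StringIO()
n_kept = filter_fasta(io.StringIO(fasta), viral_only, viral)

viral_removed = io.StringIO()
n_left = filter_fasta(io.StringIO(fasta), viral_removed, viral, remove=True)

print(viral_only.getvalue())
print(viral_removed.getvalue())
```

Streaming line by line keeps memory use flat for large contig files. A parser such as Biopython's `SeqIO` would be the more robust choice for real data.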
Current Version: 0.0.5
Improvements to be made:
- Fix .gz contig fasta outfile
- Allow paired .fastq input
- Integrate into Kraken 2 codebase
See the open issues for a list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
Distributed under the MIT License. See LICENSE for more information.
Cody Glickman - @glickman_Cody - [email protected]
Project Link: https://github.com/Strong-Lab/VirKraken
- Jo Hendrix