Haplocheck detects in-sample contamination in mtDNA or WGS sequencing studies by analyzing the mitochondrial content. To run haplocheck, you can either use our cloud web service or install it locally.
The main features of haplocheck are:
- A fast tool to detect in-sample contamination by analyzing the mitochondrial content of sequencing data.
- Works both on VCF and BAM input files.
- It estimates contamination by detecting polymorphic sites in the mtDNA data and classifies them into mitochondrial haplogroups using haplogrep.
- It can serve as a proxy for estimating nDNA contamination levels. Our results show high concordance with the 1000G contamination levels (estimated using VerifyBamID2), although concordance can vary in samples with large differences in mtDNA copy number (e.g. due to tissue or cell type).
mkdir haplocheck
cd haplocheck
wget https://github.com/genepi/haplocheck/releases/download/v1.3.3/haplocheck.zip
unzip haplocheck.zip
./haplocheck --out <out-file> <input-vcf>
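To check several VCF files in one go, the command above can be wrapped in a small loop. The following is only a sketch: the function name `run_haplocheck_batch`, the `HAPLOCHECK_BIN` variable, and the `<sample>.txt` output naming are assumptions made for illustration and are not part of haplocheck itself. Only the `--out <out-file> <input-vcf>` invocation is taken from the usage line.

```shell
# Hypothetical helper (not shipped with haplocheck): run the tool once per
# *.vcf file in a directory and write one report per sample.
run_haplocheck_batch() {
    local in_dir="$1"
    local out_dir="$2"
    # Assumption: the binary sits in the current directory unless
    # HAPLOCHECK_BIN points somewhere else.
    local bin="${HAPLOCHECK_BIN:-./haplocheck}"
    local vcf sample

    mkdir -p "$out_dir" || return 1

    for vcf in "$in_dir"/*.vcf; do
        # Skip the literal pattern when no file matched.
        [ -e "$vcf" ] || continue
        sample="$(basename "$vcf" .vcf)"
        # Same invocation as the single-file usage line.
        "$bin" --out "$out_dir/$sample.txt" "$vcf" || return 1
    done
}

# Example call (paths are placeholders):
# run_haplocheck_batch ./vcfs ./results
```

The loop stops at the first failing sample, so a broken input does not go unnoticed in a long batch.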
curl -s install.cloudgene.io | bash -s 2.3.3
./cloudgene install https://github.com/genepi/haplocheck/releases/download/v1.3.2/haplocheck.zip
Full documentation for haplocheck can be found here.
Weissensteiner H, Forer L, Fendt L, Kheirkhah A, Salas A, Kronenberg F, Schoenherr S. 2021. Contamination detection in sequencing studies using the mitochondrial phylogeny. Genome Research. http://dx.doi.org/10.1101/gr.256545.119.
See here.
Check out our blog regarding mtDNA topics.
The script on how to create in-silico mixtures of two input samples can be found here.