Haplocheck: Contamination Detection in mtDNA and Whole-Genome Sequencing Studies

Today we are happy to announce haplocheck, our approach to detecting contamination in mtDNA and even in whole-genome sequencing studies. Because the small mitochondrial genome (about 16.6 kb) is sequenced as a by-product of every whole-genome sequencing run, haplocheck can serve as a fast proxy for estimating contamination: only a fraction of the reads (those mapped to chromosome MT) need to be analysed. In principle, haplocheck works by detecting two different components (mitochondrial haplotypes) in an input profile. Homoplasmies and heteroplasmies are detected, split into two profiles, and classified using Haplogrep and the graph structure of Phylotree.

How does it work exactly?

Haplocheck works by detecting two different components (mitochondrial haplotypes) within one sample. Each heteroplasmic position (detected using mutserve) is split into its two alleles, which are added to a major and a minor profile respectively. Homoplasmic positions are added to both profiles. Each profile is then classified into a haplogroup using Haplogrep. Since two components are detected, we always output the contamination level of both the major and the minor component (please see the help page for further details).
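The splitting step described above can be illustrated with a small Python sketch. This is not haplocheck's actual code: the `Variant` record, the function names `split_profiles` and `estimate_minor_level`, the example positions and levels, and the use of the median heteroplasmy level as a contamination estimate are all assumptions made for illustration. Haplogroup classification itself (done by Haplogrep) is left out.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one variant position (not haplocheck's data model)."""
    position: int                     # position relative to rCRS
    ref_base: str                     # rCRS base at this position
    major_base: str                   # allele carried by most reads
    minor_base: Optional[str] = None  # second allele; None if homoplasmic
    minor_level: float = 0.0          # read fraction of the minor allele


def split_profiles(variants: List[Variant]) -> Tuple[List[str], List[str]]:
    """Split a mixed sample into a major and a minor variant profile."""
    major: List[str] = []
    minor: List[str] = []
    for v in variants:
        if v.minor_base is None:
            # Homoplasmic position: shared by both components.
            label = f"{v.position}{v.major_base}"
            major.append(label)
            minor.append(label)
        else:
            # Heteroplasmic position: each allele goes to its own profile,
            # but only if it differs from the reference.
            if v.major_base != v.ref_base:
                major.append(f"{v.position}{v.major_base}")
            if v.minor_base != v.ref_base:
                minor.append(f"{v.position}{v.minor_base}")
    return major, minor


def estimate_minor_level(variants: List[Variant]) -> float:
    """Assumed estimator: median minor-allele fraction over heteroplasmic sites."""
    levels = [v.minor_level for v in variants if v.minor_base is not None]
    return median(levels) if levels else 0.0


# Illustrative input only; positions and levels are made up for the example.
example = [
    Variant(263, "A", "G"),              # homoplasmic
    Variant(3010, "G", "A", "G", 0.10),  # major carries the variant
    Variant(6776, "T", "T", "C", 0.12),  # minor carries the variant
]

major_profile, minor_profile = split_profiles(example)
print(major_profile)  # ['263G', '3010A']
print(minor_profile)  # ['263G', '6776C']
```

In this sketch the two resulting profiles would each be passed to a haplogroup classifier; two different haplogroups would point to a second mitochondrial component in the sample.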

Haplocheck Output

Haplocheck reports the contamination status for each mitochondrial input sample and creates (a) a graphical report and (b) a textual description.

The graphical report includes the most important information from the textual result file. The table can be filtered, sorted and searched for specific samples. Additionally, for each sample a phylogenetic tree is generated using the graph information from Phylotree 17. The tree starts at the root node (rCRS) and shows the homoplasmic (blue) and heteroplasmic (green) positions for each transition until the final haplogroup (as assigned by Haplogrep) has been reached. The two branches represent the final haplogroups of the major and minor profile. In our example the two profiles are H1b and H3g1b.

The Mitoverse Platform

Haplocheck is available within our mitochondrial analysis platform, mitoverse. The overall goal of mitoverse is to provide mitochondrial tools (like Haplogrep or Haplocheck) as a service to the community. Even more exciting, the underlying architecture also allows the tool to be run locally from the command line. Please visit https://mitoverse.i-med.ac.at to give haplocheck a try and let us know what you think.


Sebastian Schoenherr