Benchmarking the Effectiveness and Accuracy of Multiple Mitochondrial DNA Variant Callers: Practical Implications for Clinical Application

Ip, Eddie K. K. and Troup, Michael and Xu, Colin and Winlaw, David S. and Dunwoodie, Sally L. and Giannoulatou, Eleni (2022) Benchmarking the Effectiveness and Accuracy of Multiple Mitochondrial DNA Variant Callers: Practical Implications for Clinical Application. Frontiers in Genetics, 13. ISSN 1664-8021

Available under License Creative Commons Attribution No Derivatives.



Mitochondrial DNA (mtDNA) mutations contribute to human disease across a range of severity, from rare, highly penetrant mutations that cause monogenic disorders to mutations with milder contributions to phenotypes. mtDNA variation can exist in all copies of mtDNA (homoplasmy) or in only a percentage of copies (heteroplasmy), and can be detected at levels as low as 1%. The large number of mtDNA copies and the possibility of multiple alternative alleles at the same nucleotide position make identifying allelic variation in mtDNA very challenging. In recent years, specialized variant calling algorithms have been developed to identify mtDNA variation from whole-genome sequencing (WGS) data. However, very few studies have systematically evaluated and compared these methods for the detection of both homoplasmy and heteroplasmy. We used a publicly available synthetic gold-standard dataset to assess four mtDNA variant callers (Mutserve, mitoCaller, MitoSeek, and MToolBox) and the commonly used Genome Analysis Toolkit (GATK) "best practices" pipeline, which is included in most current WGS pipelines. We also used WGS data from 126 trios and calculated the percentage of maternally inherited variants as a metric of calling accuracy, especially for homoplasmic variants. We additionally compared multiple pathogenicity prediction resources for mtDNA variants. Although most callers detected homoplasmic variants accurately and with high concordance across callers, we found a very low concordance rate between mtDNA variant callers for heteroplasmic variants, ranging from 2.8% to 3.6%, for heteroplasmy thresholds of 5% and 1%. Overall, Mutserve showed the best performance on the synthetic benchmark dataset. The analysis of mtDNA pathogenicity resources also showed low concordance in prediction results.
We have shown that while homoplasmic variant calling is consistent between callers, a significant discrepancy remains in heteroplasmic variant calling. Resources such as population frequency databases and pathogenicity predictors are now available for variant annotation but still need refinement. Given its peculiarities, the mitochondrial genome requires special consideration, and we advise caution when analyzing mtDNA variants from WGS data.
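
The two accuracy metrics mentioned in the abstract, concordance across callers and the percentage of maternally inherited variants, can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions used in the study, so everything here is an assumption for illustration: concordance is taken as the share of variants reported by every caller among those reported by any caller, a variant is treated as heteroplasmic when its level lies between the threshold and one minus the threshold, and the function names, caller labels, and toy variants are hypothetical.

```python
from typing import Dict, Set, Tuple

# A variant is identified by (position, reference allele, alternative allele).
Variant = Tuple[int, str, str]


def concordance_rate(calls: Dict[str, Set[Variant]]) -> float:
    """Assumed definition: variants reported by all callers / variants reported by any caller."""
    call_sets = list(calls.values())
    if not call_sets:
        return 0.0
    union = set().union(*call_sets)
    if not union:
        return 0.0
    shared = set.intersection(*call_sets)
    return len(shared) / len(union)


def maternal_inheritance_rate(child: Set[Variant], mother: Set[Variant]) -> float:
    """Fraction of the child's variants that are also called in the mother."""
    if not child:
        return 0.0
    return len(child & mother) / len(child)


def is_heteroplasmic(level: float, threshold: float = 0.05) -> bool:
    """Assumed rule: heteroplasmic if the variant level is within [threshold, 1 - threshold]."""
    return threshold <= level <= 1 - threshold


# Toy data (hypothetical callers and variants, not results from the study).
calls = {
    "caller_a": {(73, "A", "G"), (263, "A", "G"), (3243, "A", "G")},
    "caller_b": {(73, "A", "G"), (263, "A", "G")},
    "caller_c": {(73, "A", "G"), (263, "A", "G"), (16519, "T", "C")},
}
child = {(73, "A", "G"), (263, "A", "G"), (3243, "A", "G")}
mother = {(73, "A", "G"), (263, "A", "G")}

print(f"concordance: {concordance_rate(calls):.2f}")
print(f"maternally inherited: {maternal_inheritance_rate(child, mother):.2f}")
print(f"3% level heteroplasmic at 5% threshold: {is_heteroplasmic(0.03, 0.05)}")
print(f"3% level heteroplasmic at 1% threshold: {is_heteroplasmic(0.03, 0.01)}")
```

In this toy example a variant at a 3% level counts as heteroplasmic under a 1% threshold but not under a 5% threshold, which is one way the choice of threshold can change which calls are compared between callers.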

Item Type: Article
Subjects: R Medicine > R Medicine (General)
Depositing User: Repository Administrator
Date Deposited: 07 Jun 2022 04:10
Last Modified: 17 Jun 2022 01:26
