Jul 8, 2016

VarMatch: robust matching of small variant datasets using flexible scoring schemes

BioRxiv : the Preprint Server for Biology
Chen Sun, Paul Medvedev


Small variant calling is an important component of many analyses, and, in many instances, it is important to determine the set of variants which appear in multiple callsets. Variant matching is complicated by variants that have multiple equivalent representations. Normalization and decomposition algorithms have been proposed, but are not robust to different representation of complex variants. Variant matching is also usually done to maximize the number of matches, as opposed to other optimization criteria. We present the VarMatch algorithm for the variant matching problem. Our algorithm is based on a theoretical result which allows us to partition the input into smaller subproblems without sacrificing accuracy. VarMatch is robust to different representation of complex variants and is particularly effective in low complexity regions or those dense in variants. It also implements different optimization criteria, such as edit distance, that can lead to different results and affect conclusions about algorithm performance. Finally, the VarMatch software provides summary statistics, annotations, and visualizations that are useful for understanding callers' performance. Availability: VarMatch is freely available at: https://github.com...Continue Reading

  • References
  • Citations


  • We're still populating references for this paper, please check back later.
  • References
  • Citations


  • This paper may not have been cited yet.

Mentioned in this Paper

Computer Software
Gene Mutant

About this Paper

Related Feeds

BioRxiv & MedRxiv Preprints

BioRxiv and MedRxiv are the preprint servers for biology and health sciences respectively, operated by Cold Spring Harbor Laboratory. Here are the latest preprint articles (which are not peer-reviewed) from BioRxiv and MedRxiv.