Jun 22, 2017

Mapping-free variant calling using haplotype reconstruction from k-mer frequencies

BioRxiv : the Preprint Server for Biology
Peter Audano, Fredrik Vannberg

Abstract

Motivation: The standard protocol for detecting variation in DNA is to map millions of short sequence reads to a known reference and find loci that differ. While this approach works well, it cannot be applied where the sample contains dense variants or is too distant from known references. De novo assembly or hybrid methods can recover genomic variation, but the cost of computation is often much higher. We developed a novel k-mer algorithm and software implementation, Kestrel, capable of characterizing densely packed SNPs and large indels without mapping, assembly, or de Bruijn graphs.

Results: When applied to mosaic penicillin binding protein (PBP) genes in Streptococcus pneumoniae, we found near-perfect concordance with assembled contigs at a fraction of the CPU time. Multilocus sequence typing (MLST) with this approach was able to bypass de novo assemblies. Kestrel has a very low false-positive rate when calling variants over the whole genome, but limitations of a purely k-mer-based approach affect sensitivity.

Availability: Source code and documentation for a Java implementation of Kestrel can be found at https://github.com/paudano/kestrel. All test code for this publication is located at https://github.com/paudano/kescases.
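To make the idea of working from k-mer frequencies concrete, the following is a minimal Python sketch. It is not Kestrel's algorithm and does not use its code: the function names `count_kmers` and `low_frequency_positions`, the toy sequences, and the `min_count` threshold are all hypothetical and chosen only for illustration. It counts k-mers in a set of reads and reports reference positions whose k-mers are rare or absent in those reads, which is one simple way a difference from the reference can show up without mapping or assembly.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def low_frequency_positions(reference, counts, k, min_count=1):
    """Return start positions of reference k-mers seen fewer than min_count times.

    Illustrative heuristic only (an assumption, not the published method).
    """
    return [
        i
        for i in range(len(reference) - k + 1)
        if counts[reference[i:i + k]] < min_count
    ]


# Toy data (hypothetical): the reads carry one substitution,
# G -> T at 0-based reference position 6.
reference = "ACGTACGGTCA"
reads = ["ACGTACTGTCA", "ACGTACTGTCA"]
k = 4

counts = count_kmers(reads, k)
flagged = low_frequency_positions(reference, counts, k)
print(flagged)  # [3, 4, 5, 6]: the four reference k-mers that overlap position 6
```

In this toy case every reference k-mer that spans the substituted base is missing from the reads, so the flagged start positions bracket the variant. Real data would need to account for sequencing errors, both strands, and repeated k-mers, none of which this sketch handles.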


