BatMis: a fast algorithm for k-mismatch mapping

Bioinformatics
Chandana Tennakoon, Wing-Kin Sung

Abstract

Second-generation sequencing (SGS) generates millions of reads that must be aligned to a reference genome while tolerating errors. Although current aligners can efficiently map reads allowing a small number of mismatches, they are not well suited to handling a large number of mismatches. Various heuristics can improve the efficiency of aligners, but only at the cost of sensitivity and accuracy. In this article, we introduce Basic Alignment tool for Mismatches (BatMis), an efficient method to align short reads to a reference allowing k mismatches. BatMis is a Burrows-Wheeler transform-based aligner that uses a seed-and-extend approach, and it is an exact method. Benchmark tests show that BatMis performs better than competing aligners in solving the k-mismatch problem. Furthermore, it compares favorably even with the heuristic modes of the other aligners. BatMis is a useful alternative for applications where fast k-mismatch mappings, unique mappings or multiple mappings of SGS data are required. BatMis is written in C/C++ and is freely available from http://code.google.com/p/batmis/
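To make the k-mismatch problem and the seed-and-extend idea concrete, here is a minimal Python sketch. It is not the BatMis algorithm: the abstract does not describe BatMis's internals, so the example assumes a generic pigeonhole strategy (a read with at most k mismatches, split into k+1 pieces, must contain at least one piece that matches exactly) and uses plain `str.find` as a stand-in for Burrows-Wheeler-based exact search. All function names are hypothetical, chosen for illustration only.

```python
def hamming_within(a, b, k):
    """Count mismatches between equal-length strings; return None once the count exceeds k."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return None
    return mismatches


def brute_force_hits(reference, read, k):
    """Reference solution: test every alignment position directly."""
    m = len(read)
    hits = []
    for pos in range(len(reference) - m + 1):
        d = hamming_within(reference[pos:pos + m], read, k)
        if d is not None:
            hits.append((pos, d))
    return hits


def seed_and_extend_hits(reference, read, k):
    """Pigeonhole seed-and-extend (illustrative assumption, not BatMis itself).

    Split the read into k+1 non-overlapping seeds. Any alignment with at most
    k mismatches leaves at least one seed mismatch-free, so verifying only the
    positions implied by exact seed hits still finds every valid alignment.
    """
    m = len(read)
    if m <= k:
        raise ValueError("read must be longer than k so that every seed is non-empty")
    n_seeds = k + 1
    bounds = [i * m // n_seeds for i in range(n_seeds + 1)]
    found = {}
    for s in range(n_seeds):
        start, end = bounds[s], bounds[s + 1]
        seed = read[start:end]
        # str.find stands in for an index-based exact search.
        i = reference.find(seed)
        while i != -1:
            pos = i - start  # alignment start implied by this seed hit
            if 0 <= pos <= len(reference) - m and pos not in found:
                d = hamming_within(reference[pos:pos + m], read, k)  # extend and verify
                if d is not None:
                    found[pos] = d
            i = reference.find(seed, i + 1)
    return sorted(found.items())


reference = "ACGTACGTTACG"
read = "ACGA"
print(seed_and_extend_hits(reference, read, 1))
```

Both functions return a list of `(position, mismatch_count)` pairs and agree on every input, which is what "exact" means in this sketch: the seed filter skips work but never drops a valid alignment. A real aligner would replace the linear `str.find` scan with an index over the reference.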
