May 20, 2009

Fast and accurate short read alignment with Burrows-Wheeler transform

Bioinformatics
Heng Li, Richard Durbin

Abstract

The enormous number of short reads generated by the new DNA sequencing technologies calls for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. We implemented the Burrows-Wheeler Alignment tool (BWA), a new read alignment package based on backward search with the Burrows-Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is approximately 10-20x faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignments in the new standard SAM (Sequence Alignment/Map) format. Va...
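
The backward search mentioned in the abstract can be illustrated with a minimal Python sketch. It covers exact matching only; the mismatch and gap handling that BWA adds is not shown, and the sketch is not BWA's implementation. The function names `build_index` and `backward_search`, the naive suffix-array construction, and the toy reference string are all assumptions made for illustration.

```python
def build_index(reference):
    """Build a toy BWT index (suffix array, C table, Occ table) for a reference.

    Illustrative only: the suffix array is built by naive sorting, which is
    far too slow and memory-hungry for a genome-sized reference.
    """
    text = reference + "$"  # sentinel that sorts before A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps to '$' for i == 0)
    bwt = "".join(text[i - 1] for i in sa)

    # c_table[ch]: number of characters in text strictly smaller than ch
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == b else 0)
    return sa, c_table, occ


def backward_search(read, sa, c_table, occ):
    """Return sorted 0-based positions where `read` occurs exactly."""
    lo, hi = 0, len(sa)  # half-open suffix-array interval [lo, hi)
    for ch in reversed(read):  # consume the read from its last base
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:  # interval became empty: no exact match
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    sa, c_table, occ = build_index("GATTACAGATTACA")
    print(backward_search("TTA", sa, c_table, occ))  # [2, 9]
```

Each step narrows a suffix-array interval using one character of the read, so the number of steps depends on the read length rather than on the reference length, assuming the occurrence counts can be looked up in constant time as they are here.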

  • References: 13
  • Citations: 9847

Mentioned in this Paper

Computer Programs and Programming
DNA Resequencing
Genomics
Sequencing
Sequence Determinations, DNA
Genome, Human
Determination, Sequence Homology
Sequence Alignment
DNA Sequence
