PALADIN: protein alignment for functional profiling whole metagenome shotgun data

Bioinformatics
Anthony Westbrook, ..., Matthew D MacManes

Abstract

Whole metagenome shotgun sequencing is a powerful approach for assaying the functional potential of microbial communities. We currently lack tools that efficiently and accurately align DNA reads against protein references, the technique necessary for constructing a functional profile. Here, we present PALADIN, a novel modification of the Burrows-Wheeler Aligner that provides accurate alignment, robust reporting capabilities and orders-of-magnitude improved efficiency by directly mapping in protein space. We compared the accuracy and efficiency of PALADIN against existing tools that employ nucleotide or protein alignment algorithms. Using simulated reads, PALADIN consistently outperformed the popular DNA read mappers BWA and NovoAlign in detected proteins, percentage of reads mapped and ontological similarity. We also compared PALADIN against four existing protein alignment tools (BLASTX, RAPSearch2, DIAMOND and Lambda) using empirically obtained reads. PALADIN yielded results seven times faster than the best-performing alternative, DIAMOND, and nearly 8000 times faster than BLASTX. PALADIN's accuracy was comparable to that of all tested solutions. PALADIN was implemented in C, and its source code and documentation are available at https:...
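Mapping "in protein space" means that a DNA read has to be expressed as amino acids before it can be compared with a protein reference. The abstract does not describe how PALADIN does this internally, so the following Python sketch is only a generic illustration of the idea: a plain six-frame translation using the standard genetic code. The function names (`reverse_complement`, `translate`, `six_frame_translations`) are hypothetical and are not taken from PALADIN.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(read: str) -> dict:
    """Translate a read in all three forward and three reverse frames."""
    read = read.upper()
    rc = reverse_complement(read)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(read[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each translated frame could then be passed to a protein aligner of one's choice. A real tool would need to decide which frames or open reading frames are worth aligning, which this sketch does not attempt.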
