May 20, 2015

MICCA: a complete and accurate software for taxonomic profiling of metagenomic data

Scientific Reports
Davide Albanese, Claudio Donati

Abstract

The introduction of high-throughput sequencing technologies has triggered an increase in the number of studies in which the microbiota of environmental and human samples is characterized through the sequencing of selected marker genes. While experimental protocols have undergone a process of standardization that makes them accessible to a large community of scientists, standard and robust data analysis pipelines are still lacking. Here we introduce MICCA, a software pipeline for the processing of amplicon metagenomic datasets that efficiently combines quality filtering, clustering of Operational Taxonomic Units (OTUs), taxonomy assignment and phylogenetic tree inference. MICCA provides accurate results while reaching a good compromise between modularity and usability. Moreover, we introduce a de novo clustering algorithm specifically designed for the inference of OTUs. Tests on real and synthetic datasets show that, thanks to the optimized read filtering process and to the new clustering algorithm, MICCA provides estimates of the number of OTUs and of other common ecological indices that are more accurate and robust than those of currently available pipelines. Analysis of public metagenomic datasets shows that the ...

Mentioned in this Paper

Computer Software
Study
Trees (plant)
Environment
Nucleic Acid Sequencing
Metagenomics
Computer Programs and Programming
Sequencing
Microbiota (Procedure)
Microbial
