Spherical: an iterative workflow for assembling metagenomic datasets

BioRxiv: the Preprint Server for Biology
Thomas C A Hitch, Chris Creevey

Abstract

The consensus emerging from microbiome studies is that microbiomes are far more complex than previously thought and require deep sequencing. Although deeply sequenced datasets provide greater coverage than earlier datasets, recovering a higher proportion of reads into the assembly is still a challenge. To tackle this issue, we set out to determine whether multiple iterations of assembly would allow otherwise lost contigs to be formed and studied and, if so, how successful such an approach is at improving on current methodology. A simulated metagenomic dataset was first used to determine whether multiple iterations of assembly produce useable contigs or mis-assembled artefacts. Once we had confirmed that the secondary iterations produced accurate contigs without a reduction in contig quality, we applied this methodology, in the form of Spherical, to 3 metagenomic studies. The additional contigs produced by Spherical increased the number of reads aligning to an identified gene by 11-109% compared to the initial iteration's assembly. As the size of the dataset increased, so did the amount of data that multiple iterations were able to add.
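
The general idea of iterating assembly can be illustrated with a minimal Python sketch. It is not Spherical's implementation, and the abstract does not specify how reads are passed between iterations. The sketch assumes that each iteration reassembles only the reads that did not align to contigs from earlier iterations. The names `iterative_assembly`, `toy_assemble` and `toy_aligns` are hypothetical, and the toy functions stand in for a real assembler and read aligner.

```python
from typing import Callable, List, Tuple

Read = str
Contig = str


def iterative_assembly(
    reads: List[Read],
    assemble: Callable[[List[Read]], List[Contig]],
    aligns: Callable[[Read, Contig], bool],
    max_iterations: int = 5,
) -> Tuple[List[Contig], List[Read]]:
    """Repeatedly assemble the reads left unaligned by earlier iterations.

    Assumption (not stated in the abstract): reads that align to any contig
    from the current iteration are removed before the next iteration.
    """
    contigs: List[Contig] = []
    remaining = list(reads)
    for _ in range(max_iterations):
        if not remaining:
            break
        new_contigs = assemble(remaining)
        if not new_contigs:
            break  # the assembler can no longer build anything
        unaligned = [
            r for r in remaining
            if not any(aligns(r, c) for c in new_contigs)
        ]
        if len(unaligned) == len(remaining):
            break  # no read was recruited, so further passes cannot help
        contigs.extend(new_contigs)
        remaining = unaligned
    return contigs, remaining


def toy_assemble(pool: List[Read], min_length: int = 4) -> List[Contig]:
    """Hypothetical stand-in assembler: reports only the longest read as a
    contig, mimicking an assembler that recovers the dominant sequence."""
    candidates = [r for r in pool if len(r) >= min_length]
    return [max(candidates, key=len)] if candidates else []


def toy_aligns(read: Read, contig: Contig) -> bool:
    """Hypothetical stand-in aligner: exact substring match."""
    return read in contig


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGT", "CGTA", "TTTTGGGG", "TTTT", "GGGG", "CCC"]
    contigs, leftover = iterative_assembly(reads, toy_assemble, toy_aligns)
    print(contigs)   # contigs gathered over all iterations
    print(leftover)  # reads that never aligned to any contig
```

In this toy run the second iteration recovers a contig that the first pass missed, which mirrors the claim that secondary iterations can add otherwise lost contigs. A real workflow would replace the two toy functions with calls to an assembler and a read aligner.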
