Assessment of metagenomic assembly using simulated next generation sequencing data

PloS One
Daniel R Mende ... Peer Bork

Abstract

Due to the complexity of the protocols and a limited knowledge of the nature of microbial communities, simulating metagenomic sequences plays an important role in testing the performance of existing tools and data analysis methods with metagenomic data. We developed metagenomic read simulators with platform-specific (Sanger, pyrosequencing, Illumina) base-error models, and simulated metagenomes of differing community complexities. We first evaluated the effect of rigorous quality control on Illumina data. Although quality filtering removed a large proportion of the data, it greatly improved the accuracy and contig lengths of the resulting assemblies. We then compared the quality-trimmed Illumina assemblies to those from Sanger and pyrosequencing. For the simple community (10 genomes), all sequencing technologies assembled a similar amount and accurately represented the expected functional composition. For the more complex community (100 genomes), Illumina produced the best assemblies and more accurately resembled the expected functional composition. For the most complex community (400 genomes), there was very little assembly of reads from any sequencing technology. However, due to the longer read length, the Sanger reads still represent...
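The general idea of simulating reads with a base-error model and then quality-trimming them can be illustrated with a toy sketch. This is not the simulator described in the paper: the linearly declining quality profile, the function names (`simulate_read`, `trim_3prime`, `simulate_community`), and all parameter values are assumptions chosen only for illustration. The one fixed relationship used is the Phred definition, Q = -10 log10(p_err).

```python
import random

def simulate_read(genome, read_len, rng, q_start=38, q_end=8):
    """Draw one read whose Phred quality declines linearly (assumed toy profile)."""
    start = rng.randrange(len(genome) - read_len + 1)
    bases, quals = [], []
    for i in range(read_len):
        # Assumed profile: quality falls from q_start to q_end along the read.
        q = round(q_start + (q_end - q_start) * i / max(read_len - 1, 1))
        p_err = 10 ** (-q / 10)  # Phred definition: Q = -10 * log10(p_err)
        base = genome[start + i]
        if rng.random() < p_err:
            # Substitution error: replace with one of the other three bases.
            base = rng.choice([b for b in "ACGT" if b != base])
        bases.append(base)
        quals.append(q)
    return "".join(bases), quals

def trim_3prime(seq, quals, min_q=20):
    """Cut the read at the first base whose quality is below min_q."""
    for i, q in enumerate(quals):
        if q < min_q:
            return seq[:i], quals[:i]
    return seq, quals

def simulate_community(genomes, abundances, n_reads, read_len, seed=0):
    """Sample reads from several genomes in proportion to given abundances."""
    rng = random.Random(seed)
    names = list(genomes)
    weights = [abundances[n] for n in names]
    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights)[0]
        seq, quals = simulate_read(genomes[name], read_len, rng)
        reads.append((name, seq, quals))
    return reads
```

A community of 10, 100, or 400 genomes would be passed as a larger `genomes` dictionary. Because the assumed quality profile only ever decreases, cutting at the first low-quality base is equivalent to trimming the 3' end; real quality profiles are noisier and usually call for a windowed trimming rule.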

Methods Mentioned

Illumina sequencing
PCA

Related Concepts

In Silico
DNA, Bacterial
Probability
Computer Programs and Programming
Genome, Bacterial
Sequence Determinations, DNA
Computational Molecular Biology
Contig Mapping
Genomics
Microbiome
