Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

PLoS One
Matt J Cahill, John A C Archer

Abstract

There is a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various read lengths. Notably, unpaired reads of only 150 nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our re...
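
The repeat-counting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model, whose details are not given here. It assumes, purely for illustration, that a repeat is unresolvable when an exact copy at least as long as the read occurs more than once on the forward strand of a linear sequence, and it uses the number of merged repeat regions as a crude stand-in for the number of gaps. The function names `unresolvable_regions` and `estimate_gaps` are hypothetical.

```python
from collections import defaultdict


def unresolvable_regions(genome, read_len):
    """Return merged [start, end) intervals covered by repeated k-mers.

    Assumption (illustrative only): a read that lies entirely inside an
    exact repeat cannot be placed, so every substring of length
    `read_len` that occurs more than once marks unresolvable sequence.
    Reverse complements and inexact repeats are ignored.
    """
    positions = defaultdict(list)
    for i in range(len(genome) - read_len + 1):
        positions[genome[i:i + read_len]].append(i)

    # Start positions of every k-mer that has at least two occurrences.
    starts = sorted(i for hits in positions.values() if len(hits) > 1
                    for i in hits)

    # Merge overlapping or touching k-mer windows into repeat regions.
    regions = []
    for s in starts:
        end = s + read_len
        if regions and s <= regions[-1][1]:
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([s, end])
    return [tuple(r) for r in regions]


def estimate_gaps(genome, read_len):
    """Crude gap estimate: one gap per unresolvable repeat region."""
    return len(unresolvable_regions(genome, read_len))


if __name__ == "__main__":
    # Toy sequence: two copies of a 7 nt repeat separated by unique sequence.
    toy = "CCGT" + "GATTACA" + "TGCAAC" + "GATTACA" + "AGGCT"
    for read_len in (5, 8):
        print(read_len, estimate_gaps(toy, read_len))
```

On the toy sequence, reads of 5 nt leave both copies of the 7 nt repeat unresolved, while reads of 8 nt span each copy and leave none. This mirrors the abstract's point that the gap count depends on read length relative to the repeats present. Storing every k-mer in memory is acceptable for a toy example but would need a more economical index for whole genomes.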



Methods Mentioned

Illumina sequencing
PCR

Software Mentioned

BLASTN
Velvet
Math
MUMmer
BLAST
MUMmer package
