WiseScaffolder: an algorithm for the semi-automatic scaffolding of Next Generation Sequencing data

BMC Bioinformatics
Gregory K Farrant, Laurence Garczarek

Abstract

The sequencing depth provided by high-throughput sequencing technologies has increased the number of de novo sequenced genomes that could potentially be closed without further sequencing. However, genome scaffolding and closure require costly human supervision, so genomes are often published as drafts. A number of automatic scaffolders were recently released, which improved the global quality of genomes published in the last few years. Yet none of them reaches the efficiency of manual scaffolding. Here, we present an innovative semi-automatic scaffolder that additionally helps with chimera resolution and generates contig maps and outputs for manual improvement of the automatic scaffolding. This software was tested on the newly sequenced marine cyanobacterium Synechococcus sp. WH8103 as well as on two reference datasets used in previous studies, Rhodobacter sphaeroides and Homo sapiens chromosome 14 (http://gage.cbcb.umd.edu/). The quality of the resulting scaffolds was compared to that of three other stand-alone scaffolders: SSPACE, SOPRA and SCARPA. For all three model organisms, WiseScaffolder produced better results than the other scaffolders in terms of contiguity statistics (number of genome fragments, ...
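
Contiguity statistics such as the number of genome fragments can be computed directly from scaffold lengths. The following is a minimal sketch in Python, not the authors' implementation. It assumes the scaffold lengths are already available as a list of integers. The function name `contiguity_stats` is hypothetical, and the choice of N50 as a second statistic is an illustrative assumption, since the list of statistics in the abstract is truncated.

```python
def contiguity_stats(lengths):
    """Return (fragment count, total length, N50) for a list of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total length.
    """
    if not lengths:
        return 0, 0, 0

    ordered = sorted(lengths, reverse=True)  # longest first
    total = sum(ordered)

    running = 0
    n50 = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the cumulative
        # length to at least half of the total.
        if 2 * running >= total:
            n50 = length
            break

    return len(ordered), total, n50


if __name__ == "__main__":
    # Hypothetical scaffold lengths, for illustration only.
    fragments, total, n50 = contiguity_stats([100, 200, 300, 400])
    print(fragments, total, n50)
```

Fewer fragments and a higher N50 for the same total length generally indicate a more contiguous assembly, although neither value says anything about whether the scaffolds are correctly ordered.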

Datasets Mentioned

ERP006796
LN847356.1

Methods Mentioned

PCR

Software Mentioned

WiseScaffolder
CLC
CLC assembler
BioPython
MP
Galaxy
SAM
SOPRA
SCARPA
