HaploMerger: reconstructing allelic relationships for polymorphic diploid genome assemblies.

Genome Research
Shengfeng Huang ... Anlong Xu

Abstract

Whole-genome shotgun assembly has been a long-standing issue for highly polymorphic genomes, and the advent of next-generation sequencing technologies has made the issue more challenging than ever. Here we present an automated pipeline, HaploMerger, for reconstructing allelic relationships in a diploid assembly. HaploMerger combines a LASTZ-ChainNet alignment approach with a novel graph-based structure, which helps to untangle allelic relationships between two haplotypes and guides the subsequent creation of reference haploid assemblies. The pipeline provides flexible parameters and schemes to improve the contiguity, continuity, and completeness of the reference assemblies. We show that HaploMerger produces efficient and accurate results in simulations and has advantages over manual curation when applied to real polymorphic assemblies (e.g., 4%-5% heterozygosity). We also used HaploMerger to analyze the diploid assembly of a single Chinese amphioxus (Branchiostoma belcheri) and compared the resulting haploid assemblies with EST sequences, which revealed that the two haplotypes are not only divergent but also highly complementary to each other. Taken together, we have demonstrated that HaploMerger is an effective tool for analyz...
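
To make the idea of a graph built on top of pairwise alignments more concrete, the following is a minimal toy sketch in Python. It is not HaploMerger's actual data structure or algorithm, which the abstract does not specify. It assumes a hypothetical input of (scaffold, scaffold, score) alignment records and a hypothetical `min_score` cutoff, and simply groups scaffolds connected by sufficiently strong alignments into connected components, each of which could be treated as a candidate set of allelic scaffolds.

```python
from collections import defaultdict


def allelic_components(alignments, min_score=0):
    """Group scaffolds into connected components of an alignment graph.

    `alignments` is an iterable of (scaffold_a, scaffold_b, score) tuples.
    This record layout and the `min_score` cutoff are illustrative
    assumptions, not HaploMerger's real input format or parameters.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b, score in alignments:
        if score >= min_score:
            union(a, b)   # strong alignment: link the two scaffolds
        else:
            find(a)       # weak alignment: register both scaffolds
            find(b)       # without linking them

    groups = defaultdict(set)
    for scaffold in list(parent):
        groups[find(scaffold)].add(scaffold)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    toy = [
        ("s1", "s2", 900),
        ("s2", "s3", 850),
        ("s4", "s5", 700),
        ("s6", "s7", 10),
    ]
    for component in allelic_components(toy, min_score=100):
        print(component)
```

In this toy setting, scaffolds that end up alone in a component have no partner above the cutoff, while larger components mark regions where the relationship between scaffolds would need further untangling.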


