OPERA-LG: efficient and exact scaffolding of large, repeat-rich eukaryotic genomes with performance guarantees

Genome Biology
Song Gao, Niranjan Nagarajan

Abstract

The assembly of large, repeat-rich eukaryotic genomes represents a significant challenge in genomics. While long-read technologies have made the high-quality assembly of small, microbial genomes increasingly feasible, data generation can be expensive for larger genomes. OPERA-LG is a scalable, exact algorithm for the scaffold assembly of large, repeat-rich genomes that outperforms state-of-the-art programs in scaffold correctness and contiguity. It provides a rigorous framework for scaffolding repetitive sequences and a systematic approach for combining data from different second-generation and third-generation sequencing technologies. OPERA-LG provides an avenue for systematic augmentation and improvement of thousands of existing draft eukaryotic genome assemblies.

Related Concepts

Metazoa
Selfish DNA
Computer Programs and Programming
Sequence Determinations, DNA
Contig Mapping
Genome Size
Genome
Repeated Measures
Molecular Assembly/Self Assembly
Genomics
