Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega-reads algorithm

Genome Research
Aleksey V Zimin, Steven L Salzberg

Abstract

Long sequencing reads generated by single-molecule sequencing technology offer the possibility of dramatically improving the contiguity of genome assemblies. The biggest challenge today is that long reads have relatively high error rates, currently around 15%. The high error rates make it difficult to use this data alone, particularly with highly repetitive plant genomes. Errors in the raw data can lead to insertion or deletion errors (indels) in the consensus genome sequence, which in turn create significant problems for downstream analysis; for example, a single indel may shift the reading frame and incorrectly truncate a protein sequence. Here, we describe an algorithm that solves the high error rate problem by combining long, high-error reads with shorter but much more accurate Illumina sequencing reads, whose error rates average <1%. Our hybrid assembly algorithm combines these two types of reads to construct mega-reads, which are both long and accurate, and then assembles the mega-reads using the CABOG assembler, which was designed for long reads. We apply this technique to a large data set of Illumina and PacBio sequences from the species Aegilops tauschii, a large and extremely repetitive plant genome that has resisted ...
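The abstract's point that a single indel can shift the reading frame and truncate a protein can be illustrated with a minimal sketch. The sequence below is a made-up toy example, not data from the paper, and the `translate` helper is a hypothetical illustration using the standard genetic code; it is not part of MaSuRCA or CABOG.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate DNA from position 0 until the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# Toy coding sequence (hypothetical): ATG GCA TTG ACC GGT AAA TAA
reference = "ATGGCATTGACCGGTAAATAA"

# Simulate a one-base deletion error in the consensus (drop the base at index 3).
with_deletion = reference[:3] + reference[4:]

print(translate(reference))      # MALTGK
print(translate(with_deletion))  # MH
```

In this toy case, deleting one base moves every downstream codon out of frame, and a stop codon (TGA) appears after only two residues, so the predicted protein is cut short.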

