Probabilistic error correction for RNA sequencing

Nucleic Acids Research
Hai-Son Le, Ziv Bar-Joseph

Abstract

Sequencing of RNAs (RNA-Seq) has revolutionized the field of transcriptomics, but the reads obtained often contain errors. Read error correction can have a large impact on our ability to accurately assemble transcripts. This is especially true for de novo transcriptome analysis, where a reference genome is not available. Current read error correction methods, developed for DNA sequence data, cannot handle the overlapping effects of non-uniform abundance, polymorphisms and alternative splicing. Here we present SEquencing Error CorrEction in Rna-seq data (SEECER), a hidden Markov model (HMM)-based method, which is the first to successfully address these problems. SEECER efficiently learns hundreds of thousands of HMMs and uses these to correct sequencing errors. Using human RNA-Seq data, we show that SEECER greatly improves on previous methods in terms of quality of read alignment to the genome and assembly accuracy. To illustrate the usefulness of SEECER for de novo transcriptome studies, we generated new RNA-Seq data to study the development of the sea cucumber Parastichopus parvimensis. Our corrected assembled transcripts shed new light on two important stages in sea cucumber development. Comparison of the assembled transcript...
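To make the correction idea concrete, here is a minimal toy sketch. It is not SEECER's algorithm: instead of an HMM it uses a simple per-column majority vote over reads that are assumed to be already aligned and of equal length. The function name `correct_reads` and the `min_support` threshold are hypothetical, chosen only for illustration. The threshold shows why polymorphisms complicate correction: a minority base backed by enough reads is left untouched, while a rare base is treated as a likely sequencing error.

```python
from collections import Counter


def correct_reads(reads, min_support=0.2):
    """Correct pre-aligned, equal-length reads column by column.

    A base is replaced by the column's most common base only when its
    own share of the column is below min_support, so a variant backed
    by enough reads (a possible polymorphism) is left alone.
    Toy stand-in for illustration; NOT the HMM-based SEECER method.
    """
    if not reads:
        return []
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("toy example expects equal-length, pre-aligned reads")

    corrected = [list(r) for r in reads]
    for col in range(length):
        # Count how often each base occurs at this position.
        counts = Counter(r[col] for r in reads)
        consensus, _ = counts.most_common(1)[0]
        for row in corrected:
            # Rewrite only bases that are too rare to be trusted.
            if counts[row[col]] / len(reads) < min_support:
                row[col] = consensus
    return ["".join(row) for row in corrected]
```

For example, given six copies of `ACGTAC`, three copies of `ACGTTC` and a single `ACCTAC`, the sketch rewrites the lone `ACCTAC` to `ACGTAC` but keeps the three `ACGTTC` reads, since their variant base is supported by 30% of the column. A fixed threshold like this ignores non-uniform abundance and alternative splicing, which are the effects the abstract says a DNA-oriented corrector cannot handle.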
