Estimation of alternative splicing isoform frequencies from RNA-Seq data

Algorithms for Molecular Biology : AMB
Marius Nicolae, ..., Alex Zelikovsky

Abstract

Massively parallel whole transcriptome sequencing, commonly referred to as RNA-Seq, is quickly becoming the technology of choice for gene expression profiling. However, because current sequencing technologies deliver short reads, estimating expression levels for alternative splicing gene isoforms remains challenging. In this paper we present a novel expectation-maximization algorithm for inference of isoform- and gene-specific expression levels from RNA-Seq data. Our algorithm, referred to as IsoEM, is based on disambiguating information provided by the distribution of insert sizes generated during sequencing library preparation, and takes advantage of base quality scores, strand information, and read pairing information when available. The open source Java implementation of IsoEM is freely available at http://dna.engr.uconn.edu/software/IsoEM/. Experiments on both synthetic and real RNA-Seq datasets show that IsoEM has scalable running time and outperforms existing methods of isoform and gene expression level estimation. Simulation experiments confirm previous findings that, for a fixed sequencing cost, using reads longer than 25-36 bases does not necessarily lead to better accuracy for estimating expression levels ...
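To make the expectation-maximization idea in the abstract concrete, here is a minimal Python sketch of a generic EM loop for isoform frequency estimation. It is not the IsoEM implementation (which is written in Java), and it may differ from the paper's exact update rules. The function name, the input layout (`read_weights`, `effective_lengths`), and the convergence settings are assumptions made for illustration. In IsoEM the per-read weights would be derived from insert sizes, base quality scores, strand, and pairing information; here they are simply supplied as numbers.

```python
def estimate_isoform_frequencies(read_weights, effective_lengths,
                                 max_iterations=500, tolerance=1e-9):
    """Generic EM sketch (hypothetical helper, not the IsoEM API).

    read_weights: list with one dict per read, mapping isoform id to a
        non-negative compatibility weight for that read.
    effective_lengths: dict mapping isoform id to a positive length used
        to normalize expected read counts.
    Returns a dict mapping isoform id to an estimated relative frequency.
    """
    isoforms = list(effective_lengths)
    # Start from a uniform guess over all isoforms.
    freq = {j: 1.0 / len(isoforms) for j in isoforms}

    for _ in range(max_iterations):
        # E-step: split each read fractionally among its compatible
        # isoforms, in proportion to current frequency times weight.
        expected_reads = {j: 0.0 for j in isoforms}
        for weights in read_weights:
            total = sum(freq[j] * w for j, w in weights.items())
            if total == 0.0:
                continue  # read is incompatible with every isoform
            for j, w in weights.items():
                expected_reads[j] += freq[j] * w / total

        # M-step: length-normalize the expected counts and rescale so
        # the frequencies sum to one.
        rates = {j: expected_reads[j] / effective_lengths[j] for j in isoforms}
        norm = sum(rates.values())
        if norm == 0.0:
            break  # no usable reads; keep the current estimate
        new_freq = {j: rates[j] / norm for j in isoforms}

        # Stop once the largest change falls below the tolerance.
        change = max(abs(new_freq[j] - freq[j]) for j in isoforms)
        freq = new_freq
        if change < tolerance:
            break

    return freq


if __name__ == "__main__":
    # Toy data: reads unique to isoform A, unique to B, and ambiguous.
    reads = [{"A": 1.0}] * 30 + [{"B": 1.0}] * 10 + [{"A": 1.0, "B": 1.0}] * 40
    print(estimate_isoform_frequencies(reads, {"A": 1000.0, "B": 1000.0}))
```

In the toy data, the ambiguous reads carry no information of their own, so the loop gradually shares them out in the same proportion as the uniquely assigned reads. This redistribution is the core of the approach; richer weights only change how strongly each read favors one isoform over another.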
