MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples

Bioinformatics
Jonas Behr, Gunnar Rätsch

Abstract

High-throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in the detection of expressed genes and reconstruction of RNA transcripts. However, the extensive dynamic range of gene expression, technical limitations and biases, as well as the observed complexity of the transcriptional landscape, pose profound computational challenges for transcriptome reconstruction. We present the novel framework MITIE (Mixed Integer Transcript IdEntification) for simultaneous transcript reconstruction and quantification. We define a likelihood function based on the negative binomial distribution, use a regularization approach to select a few transcripts collectively explaining the observed read data and show how to find the optimal solution using Mixed Integer Programming. MITIE can (i) take advantage of known transcripts, (ii) reconstruct and quantify transcripts simultaneously in multiple samples, and (iii) resolve the location of multi-mapping reads. It is designed for genome- and assembly-based transcriptome reconstruction. We present an extensive study based on realistic simulated RNA-Seq data. When compared with state-of-the-art approaches, MITIE proves to be significantly more sensitive and overall more accurate.
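The idea in the abstract (a negative binomial likelihood plus a penalty on the number of transcripts used) can be illustrated with a toy sketch. This is not MITIE's actual formulation or code. The segment encoding, the function names, the dispersion `r`, the penalty weight `lam`, and the small abundance grid are all assumptions made for illustration, and exhaustive enumeration stands in for the mixed integer programming solver.

```python
import itertools
import math


def nb_log_pmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu and dispersion r."""
    if mu <= 0:
        # With zero expected coverage, only a zero count is possible.
        return 0.0 if k == 0 else float("-inf")
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))


def penalized_objective(abundances, transcripts, observed, r, lam):
    """Negative log-likelihood of the segment counts plus lam per transcript in use."""
    total = 0.0
    for s, k in enumerate(observed):
        # Expected coverage of segment s: sum over transcripts that contain it.
        mu = sum(a * t[s] for a, t in zip(abundances, transcripts))
        total -= nb_log_pmf(k, mu, r)
    used = sum(1 for a in abundances if a > 0)
    return total + lam * used


def best_transcript_set(transcripts, observed, grid, r=10.0, lam=1.0):
    """Try every abundance assignment on the grid (hypothetical stand-in for a MIP solver)."""
    best = None
    for abundances in itertools.product(grid, repeat=len(transcripts)):
        score = penalized_objective(abundances, transcripts, observed, r, lam)
        if best is None or score < best[0]:
            best = (score, abundances)
    return best


if __name__ == "__main__":
    # Three candidate transcripts over three segments (1 = segment included).
    candidates = [(1, 1, 1), (1, 0, 1), (0, 1, 1)]
    coverage = [30, 10, 30]  # made-up observed counts per segment
    print(best_transcript_set(candidates, coverage, grid=[0, 10, 20, 30]))
```

On this made-up input, the lowest-scoring assignment uses the first two candidates with abundances 10 and 20 and leaves the third unused, which shows how the penalty favours a small set of transcripts that jointly explain the coverage. A real solver would replace the enumeration, which grows exponentially with the number of candidates.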
