DOI: 10.1101/503672
Dec 21, 2018
Paper

A revisit of RSEM generative model and its EM algorithm for quantifying transcript abundances.

BioRxiv : the Preprint Server for Biology
Hy Vuong, Son Pham

Abstract

RSEM is known mainly for its accuracy in transcript abundance quantification. However, its quantification time is far higher than that of recent quantification tools. In this paper, we revise RSEM's EM algorithm. First, we derive accurate M-step updates that replace the incorrect heuristic updates in RSEM. Second, we implement optimizations that reduce the quantification time about a hundredfold while still achieving better accuracy than RSEM. Specifically, we observed that different parameters converge at different rates, so we identify and remove parameters that converge early, which significantly reduces the model complexity in later iterations, and we use the SQUAREM method to further speed up convergence. We implemented these revisions in a package named Hera-EM, with source code available at: https://github.com/bioturing/hera/tree/master/hera-EM
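
To make the combination of EM and SQUAREM concrete, here is a minimal Python sketch. It is not the Hera-EM code or API. It assumes a much simpler model than RSEM's: each read lists the transcripts it is compatible with and is equally likely under any of them, so transcript lengths, fragment positions and quality scores are ignored. The names `em_step`, `squarem_step` and `quantify` are hypothetical, and the removal of early-converged parameters is not shown.

```python
import math


def em_step(theta, reads):
    """One plain EM update under the simplified model (an assumption).

    theta: list of transcript proportions.
    reads: list of tuples, each holding the indices of compatible transcripts.
    """
    counts = [0.0] * len(theta)
    for compat in reads:
        total = sum(theta[t] for t in compat)
        if total == 0.0:
            continue  # read cannot be explained by the current estimate
        # E-step: split the read across transcripts in proportion to theta
        for t in compat:
            counts[t] += theta[t] / total
    n = sum(counts)
    if n == 0.0:
        return list(theta)
    # M-step: new proportions are the normalized expected counts
    return [c / n for c in counts]


def squarem_step(theta, reads):
    """One SQUAREM-style accelerated update built from two EM steps."""
    theta1 = em_step(theta, reads)
    theta2 = em_step(theta1, reads)
    r = [a - b for a, b in zip(theta1, theta)]
    v = [(c - b) - d for c, b, d in zip(theta2, theta1, r)]
    norm_r = math.sqrt(sum(x * x for x in r))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        return theta2  # no curvature information: keep the EM result
    # Step length; alpha = -1 reproduces theta2, so never step less than that
    alpha = min(-norm_r / norm_v, -1.0)
    proposal = [t - 2.0 * alpha * d + alpha * alpha * w
                for t, d, w in zip(theta, r, v)]
    if min(proposal) < 0.0:
        return theta2  # extrapolation left the simplex: fall back to EM
    # Stabilizing EM step applied to the extrapolated point
    return em_step(proposal, reads)


def quantify(reads, n_transcripts, tol=1e-10, max_iter=1000):
    """Iterate accelerated updates from a uniform start until theta stops moving."""
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(max_iter):
        new = squarem_step(theta, reads)
        if max(abs(a - b) for a, b in zip(new, theta)) < tol:
            return new
        theta = new
    return theta


# Toy data: 6 reads unique to transcript 0, 2 unique to transcript 1,
# 2 ambiguous between them, and none for transcript 2.
reads = [(0,)] * 6 + [(1,)] * 2 + [(0, 1)] * 2
theta = quantify(reads, 3)
```

For this toy input the likelihood is proportional to θ0^6 · θ1^2 · (θ0 + θ1)^2, which is maximized at θ0 = 0.75, θ1 = 0.25, θ2 = 0. That gives a simple check that the accelerated iteration reaches the same fixed point as plain EM.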
