Jul 14, 2014

Benchmark Analysis of Algorithms for Determining and Quantifying Full-length mRNA Splice Forms from RNA-Seq Data

BioRxiv : the Preprint Server for Biology
Katharina HayerGregory R Grant

Abstract

The advantages of RNA sequencing (RNA-Seq) suggest it will replace microarrays for highly parallel gene expression analysis. For example, in contrast to arrays, RNA-Seq is expected to be able to provide accurate identification and quantification of full-length transcripts. A number of methods have been developed for this purpose, but short error prone reads makes it a difficult problem in practice. It is essential to determine which algorithms perform best, and where and why they fail. However, there is a dearth of independent and unbiased benchmarking studies of these algorithms. Here we take an approach using both simulated and experimental benchmark data to evaluate their accuracy. We conclude that most methods are inaccurate even using idealized data, and that no is method sufficiently accurate once complicating factors such as polymorphisms, intron signal, sequencing error, and multiple splice forms are present. These results point to the pressing need for further algorithm development.

  • References
  • Citations

References

  • We're still populating references for this paper, please check back later.
  • References
  • Citations

Citations

  • This paper may not have been cited yet.

Mentioned in this Paper

Gene Polymorphism
Study
Sequence Determinations, RNA
Splice Variants, Protein
Nucleic Acid Sequencing
Gene Expression
Sequencing
Simulation
Introns
Analysis

Related Feeds

BioRxiv & MedRxiv Preprints

BioRxiv and MedRxiv are the preprint servers for biology and health sciences respectively, operated by Cold Spring Harbor Laboratory. Here are the latest preprint articles (which are not peer-reviewed) from BioRxiv and MedRxiv.