A new strategy to reduce allelic bias in RNA-Seq readmapping.

Nucleic Acids Research
Ravi Vijaya Satya, Jaques Reifman

Abstract

Accurate estimation of expression levels from RNA-Seq data entails precise mapping of the sequence reads to a reference genome. Because the standard reference genome contains only one allele at any given locus, reads overlapping polymorphic loci that carry a non-reference allele are at least one mismatch away from the reference and, hence, are less likely to be mapped. This bias in read mapping leads to inaccurate estimates of allele-specific expression (ASE). To address this read-mapping bias, we propose the construction of an enhanced reference genome that includes the alternative alleles at known polymorphic loci. We show that mapping to this enhanced reference reduced the read-mapping biases, leading to more reliable estimates of ASE. Experiments on simulated data show that the proposed strategy reduced the number of loci with mapping bias by ≥ 63% when compared with a previous approach that relies on masking the polymorphic loci and by ≥ 18% when compared with the standard approach that uses an unaltered reference. When we applied our strategy to actual RNA-Seq data, we found that it mapped up to 15% more reads than the previous approaches and identified many seemingly incorrect inferences made by them.
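The abstract does not spell out how the enhanced reference is assembled, so the following is only a minimal sketch of the general idea under stated assumptions: for each known SNP, a short window of reference sequence carrying the alternative allele is added as an extra entry alongside the original chromosomes. The names `Snp`, `build_enhanced_reference`, `mismatches`, and the `flank` parameter are hypothetical, and each SNP is treated independently (nearby SNPs are not combined into haplotypes).

```python
from typing import Dict, List, NamedTuple


class Snp(NamedTuple):
    chrom: str
    pos: int  # 0-based position on the chromosome
    ref: str  # reference allele (single base)
    alt: str  # alternative allele (single base)


def build_enhanced_reference(
    reference: Dict[str, str], snps: List[Snp], flank: int
) -> Dict[str, str]:
    """Return the reference plus one alt-allele window per SNP.

    Assumption: a window of `flank` bases on each side of the SNP is
    enough for any read overlapping the SNP to align to it.
    """
    enhanced = dict(reference)  # keep the original sequences untouched
    for snp in snps:
        seq = reference[snp.chrom]
        if seq[snp.pos] != snp.ref:
            raise ValueError(f"reference allele mismatch at {snp.chrom}:{snp.pos}")
        start = max(0, snp.pos - flank)
        end = min(len(seq), snp.pos + flank + 1)
        # Same flanking sequence as the reference, with the alt base swapped in.
        window = seq[start:snp.pos] + snp.alt + seq[snp.pos + 1:end]
        name = f"{snp.chrom}:{start}-{end}:{snp.pos}{snp.ref}>{snp.alt}"
        enhanced[name] = window
    return enhanced


def mismatches(read: str, target: str) -> int:
    """Count mismatching bases between a read and an equal-length target."""
    return sum(a != b for a, b in zip(read, target))


# Toy example: one SNP (C>T) at position 5 of a 12-base "chromosome".
reference = {"chr1": "ACGTACGTACGT"}
snps = [Snp("chr1", 5, "C", "T")]
enhanced = build_enhanced_reference(reference, snps, flank=3)

alt_read = "TATGT"  # read covering positions 3..7 with the alt allele
vs_standard = mismatches(alt_read, reference["chr1"][3:8])
vs_enhanced = mismatches(alt_read, enhanced["chr1:2-9:5C>T"][1:6])
print(vs_standard, vs_enhanced)
```

In this toy case the read carrying the non-reference allele is one mismatch away from the standard reference but matches the added window exactly, which is the asymmetry the enhanced reference is meant to remove. A real pipeline would still need to translate alignments on the added windows back to genomic coordinates.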



Datasets Mentioned

GSE18156
GM19239
GM19238

Methods Mentioned

RNA-Seq

Software Mentioned

GSNAP
MAQ
SAMtools
BWA
SeqAn
GenomeMapper
