BUTTERFLY: addressing the pooled amplification paradox with unique molecular identifiers in single-cell RNA-seq.

Genome Biology
Johan Gustafsson, Lior Pachter

Abstract

The incorporation of unique molecular identifiers (UMIs) in single-cell RNA-seq assays makes it possible to identify duplicated molecules, which facilitates counting distinct molecules from sequenced reads. However, we show that the naïve removal of duplicates can lead to a bias due to a "pooled amplification paradox," and we propose an improved quantification method based on unseen species modeling. Our correction, called BUTTERFLY, uses a zero-truncated negative binomial estimator implemented in the kallisto bustools workflow. We demonstrate its efficacy across cell types and genes and show that in some cases it can invert the relative abundance of genes.
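To make the idea of a zero-truncated negative binomial (ZTNB) estimator concrete, the following is a minimal sketch, not the BUTTERFLY or kallisto bustools implementation. It assumes the input is a histogram of copies per UMI (how many molecules were seen once, twice, and so on), fits the ZTNB by a coarse-to-fine grid search (a simple stand-in for a proper optimizer), and then estimates how many molecules went unseen. All function names and the input format are hypothetical.

```python
import math


def nb_log_p0(mu, r):
    """log P(X = 0) for a negative binomial with mean mu and size r."""
    return r * (math.log(r) - math.log(r + mu))


def nb_log_pmf(k, mu, r):
    """log P(X = k) for a negative binomial with mean mu and size r."""
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + nb_log_p0(mu, r) + k * (math.log(mu) - math.log(r + mu)))


def ztnb_log_likelihood(hist, mu, r):
    """Log-likelihood of a copies-per-UMI histogram under the ZTNB.

    hist maps a copy number k >= 1 to the number of molecules seen k times.
    Molecules seen zero times are unobservable, hence the truncation term.
    """
    log_seen = math.log1p(-math.exp(nb_log_p0(mu, r)))  # log(1 - P(X = 0))
    return sum(n * (nb_log_pmf(k, mu, r) - log_seen) for k, n in hist.items())


def fit_ztnb(hist, rounds=6, grid=21):
    """Maximize the likelihood with a coarse-to-fine grid search in log space.

    This is an illustrative stand-in for a real optimizer, not a tuned method.
    """
    center_mu, center_r = 0.0, 0.0      # start at mu = r = 1 (log space)
    half = math.log(1e3)                # search three decades either side
    for _ in range(rounds):
        step = 2 * half / (grid - 1)
        log_mus = [center_mu - half + i * step for i in range(grid)]
        log_rs = [center_r - half + j * step for j in range(grid)]
        _, center_mu, center_r = max(
            (ztnb_log_likelihood(hist, math.exp(lm), math.exp(lr)), lm, lr)
            for lm in log_mus for lr in log_rs
        )
        half = 2 * step                 # zoom in around the best grid point
    return math.exp(center_mu), math.exp(center_r)


def estimate_total_molecules(hist, mu, r):
    """Observed molecules divided by the fitted probability of being seen."""
    observed = sum(hist.values())
    return observed / (1.0 - math.exp(nb_log_p0(mu, r)))


def predict_distinct(hist, mu, r, depth_scale):
    """Expected distinct molecules if sequencing depth were scaled by depth_scale."""
    total = estimate_total_molecules(hist, mu, r)
    return total * (1.0 - math.exp(nb_log_p0(depth_scale * mu, r)))
```

As a usage sketch, `mu, r = fit_ztnb(hist)` followed by `predict_distinct(hist, mu, r, 2.0)` gives the number of distinct molecules the fitted model would expect at twice the sequencing depth. Scaling `mu` while holding `r` fixed is the modeling assumption made here for a change in depth.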

Software Mentioned

LBFGSpp
UCSC
ENSEMBL
Next Gem
GenomicFeatures and Biostrings
PreseqR
seq
kallisto bustools
BSgenome (R package)
10X Genomics

Applications of Molecular Barcoding

Molecular barcoding attaches a unique sequence barcode to each original DNA or RNA molecule. Sequence reads with different barcodes represent different original molecules, while reads sharing a barcode result from PCR duplication of a single original molecule.
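The counting rule described above can be sketched in a few lines. This is an illustrative example only: it assumes each read has already been assigned a cell barcode, a UMI, and a gene, and the function name and tuple layout are hypothetical rather than taken from any particular tool.

```python
from collections import Counter


def count_molecules(reads):
    """Collapse PCR duplicates and count distinct molecules per (cell, gene).

    reads is an iterable of (cell_barcode, umi, gene) tuples, one per read.
    Reads that share all three fields are treated as copies of one molecule.
    """
    # Number of reads (copies) observed for each distinct molecule.
    copies = Counter(reads)
    # Each distinct (cell, umi, gene) key contributes one molecule to its gene.
    molecules = Counter((cell, gene) for cell, _umi, gene in copies)
    # Histogram of copy numbers: how many molecules were seen once, twice, ...
    copy_histogram = Counter(copies.values())
    return molecules, copy_histogram


reads = [
    ("cellA", "AAGT", "Gapdh"),
    ("cellA", "AAGT", "Gapdh"),  # duplicate of the read above
    ("cellA", "CCTA", "Gapdh"),
    ("cellB", "AAGT", "Actb"),
]
molecules, copy_histogram = count_molecules(reads)
```

Here the two identical reads collapse to one molecule, so `cellA` has two `Gapdh` molecules from three reads.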
