Apr 1, 2020

Demystifying "drop-outs" in single cell UMI data

BioRxiv : the Preprint Server for Biology
T. KimMengjie Chen


Analysis of scRNA-seq data has been challenging particularly because of excessive zeros observed in UMI counts. Prevalent opinions are that many of the detected zeros are "drop-outs" that occur during experiments and that those zeros should be accounted for through procedures such as normalization, variance stabilization, and imputation. Here, we extensively analyze publicly available UMI datasets and challenge the existing scRNA-seq workflows. Our results strongly suggest that resolving cell-type heterogeneity should be the foremost step of the scRNA-seq analysis pipeline because once cell-type heterogeneity is resolved, "drop-outs" disappear. Additionally, we show that the simplest parametric count model, Poisson, is sufficient to fully leverage the biological information contained in the UMI data, thus offering a more optimistic view of the data analysis. However, if the cell-type heterogeneity is not appropriately taken into account, pre-processing such as normalization or imputation becomes inappropriate and can introduce unwanted noise. Inspired by these analyses, we propose a zero inflation test that can select gene features contributing to cell-type heterogeneity. We integrate feature selection and clustering into itera...Continue Reading

  • References
  • Citations


  • We're still populating references for this paper, please check back later.
  • References
  • Citations


  • This paper may not have been cited yet.

Mentioned in this Paper

HIV Infections
Objective (Goal)
Statistical Technique
Laboratory Culture
Virus Activation
Dilution Technique
Human immunodeficiency virus type 1 antibody

Related Feeds

BioRxiv & MedRxiv Preprints

BioRxiv and MedRxiv are the preprint servers for biology and health sciences respectively, operated by Cold Spring Harbor Laboratory. Here are the latest preprint articles (which are not peer-reviewed) from BioRxiv and MedRxiv.