DOI: 10.1101/504365
Dec 21, 2018
Paper

Sparse functional data analysis accounts for missing information in single-cell epigenomics

BioRxiv: the Preprint Server for Biology
Pedro Madrigal, Pantelis Z. Hadjipantelis

Abstract

Single-cell epigenome assays produce sparsely sampled data, so coverage is commonly pooled across cells to increase resolution. Imputation of missing data using deep learning is available but computationally intensive, and it has been applied only to DNA methylation obtained by single-cell bisulfite sequencing. Here, sparsity in chromatin accessibility obtained by scNMT-seq is addressed using functional data analysis: sparsely sampled GpC coverage profiles of individual cells are fitted by taking into account all cells of the same cell type or condition. To that end, sparse functional principal component analysis (S-FPCA) is applied, and the principal components are used to estimate chromatin accessibility coverage in individual cells. This methodology can potentially be used with other single-cell assays with missing data, such as scBS-seq, scNOME-seq, or scATAC-seq. The R package fdapace is available on CRAN, and the R code used in this manuscript can be found at http://github.com/pmb59/sparseSingleCell.
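
To make the idea in the abstract concrete, the following is a minimal NumPy sketch of sparse FPCA on a common grid, where unobserved positions are marked as NaN. It is not the fdapace implementation and not the authors' code: it assumes a shared grid, skips the kernel smoothing of the mean and covariance surfaces that a full S-FPCA would apply, and uses a crude noise-variance estimate. The function name `sparse_fpca_fit` and the simulated data are hypothetical and serve only as illustration. The mean and covariance are pooled over all cells, and each cell's principal component scores are then estimated by conditional expectation from its own observed points.

```python
import numpy as np


def sparse_fpca_fit(Y, n_components=2):
    """Simplified sparse FPCA. Y is (n_cells, n_grid) with NaN where unobserved.

    Hypothetical helper for illustration; no smoothing step is performed.
    """
    Y = np.asarray(Y, dtype=float)
    obs = ~np.isnan(Y)

    # Pooled mean: average over the cells that cover each grid position.
    n_obs = obs.sum(axis=0)
    mu = np.where(obs, Y, 0.0).sum(axis=0) / np.maximum(n_obs, 1)

    # Pooled covariance: average residual products over cells covering both positions.
    R = np.where(obs, Y - mu, 0.0)
    pair_counts = obs.T.astype(float) @ obs.astype(float)
    cov = (R.T @ R) / np.maximum(pair_counts, 1.0)

    # Leading eigenpairs of the pooled covariance.
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    lam = np.clip(evals[order][:n_components], 0.0, None)
    phi = evecs[:, order][:, :n_components]

    # Crude noise variance (assumption): mean of the discarded eigenvalues, floored.
    rest = evals[order][n_components:]
    sigma2 = max(float(rest.mean()) if rest.size else 0.0, 1e-6)

    # Scores by conditional expectation, using only each cell's observed points.
    scores = np.zeros((Y.shape[0], n_components))
    for i in range(Y.shape[0]):
        idx = obs[i]
        if not idx.any():
            continue  # a cell with no coverage keeps zero scores (mean profile)
        phi_i = phi[idx]
        sigma_i = (phi_i * lam) @ phi_i.T + sigma2 * np.eye(int(idx.sum()))
        scores[i] = lam * (phi_i.T @ np.linalg.solve(sigma_i, R[i, idx]))

    fitted = mu + scores @ phi.T
    return fitted, scores, phi, mu


# Simulated example (made-up data): one smooth component, 40% of positions observed.
rng = np.random.default_rng(0)
n_cells, n_grid = 300, 30
t = np.linspace(0.0, 1.0, n_grid)
true_phi = np.sin(np.pi * t)
true_phi /= np.linalg.norm(true_phi)
truth = 1.0 + rng.normal(0.0, 5.0, size=(n_cells, 1)) * true_phi
noisy = truth + rng.normal(0.0, 0.1, size=truth.shape)
observed = np.where(rng.random(truth.shape) < 0.4, noisy, np.nan)

fitted, scores, phi, mu = sparse_fpca_fit(observed, n_components=1)
```

In this sketch, `fitted` holds a complete profile for every cell, including positions that were never observed in that cell, because the eigenfunctions are learned from all cells together. For real scNMT-seq data, the smoothing-based estimators in fdapace would be the more appropriate choice.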
