Linnorm: improved statistical analysis for single cell RNA-seq expression data

Nucleic Acids Research
Shun H Yip ... Junwen Wang

Abstract

Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm was developed to remove technical noise while preserving biological variation in scRNA-seq data, so that existing statistical methods perform better on the processed data. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods for the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis and related tasks. In differentially expressed gene (DEG) analysis, Linnorm also performs better than existing methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy.
