A loop-counting method for covariate-corrected low-rank biclustering of gene-expression and genome-wide association study data.

PLoS Computational Biology
Aaditya V Rangan, Bipolar Disorders Working Group of the Psychiatric Genomics Consortium

Abstract

A common goal in data analysis is to sift through a large data matrix and detect any significant submatrices (i.e., biclusters) that have a low numerical rank. We present a simple algorithm for tackling this biclustering problem. Our algorithm accumulates information about 2-by-2 submatrices (i.e., 'loops') within the data matrix, and focuses on the rows and columns that participate in an abundance of low-rank loops. We demonstrate, through analysis and numerical experiments, that this loop-counting method performs well in a variety of scenarios, outperforming simple spectral methods in many situations of interest. Another important feature of our method is that it can easily be modified to account for aspects of experimental design that commonly arise in practice. For example, the algorithm can be modified to correct for controls, for categorical and continuous covariates, and for sparsity within the data. We demonstrate these practical features with two examples: the first drawn from gene-expression analysis and the second drawn from a much larger genome-wide association study (GWAS).
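
To make the loop-counting idea concrete, here is a minimal Python sketch of one way to score rows by the loops they participate in. It is an illustration under stated assumptions, not the authors' implementation: binarizing entries by sign is a simplification chosen here, and the names `row_loop_scores` and `planted_example` are hypothetical. The sketch relies on one fact: a 2-by-2 matrix with entries of ±1 has rank 1 exactly when the product of its four entries is +1. Covariate and control corrections are not included.

```python
import random


def sign(x):
    """Binarize an entry: +1 for positive values, -1 otherwise."""
    return 1 if x > 0 else -1


def row_loop_scores(data):
    """Score each row by the signed count of loops it takes part in.

    A loop is a 2-by-2 submatrix given by rows (j, j2) and columns (k, k2).
    After sign-binarization (an assumption of this sketch), a loop adds +1
    if it is rank-1 and -1 otherwise. Column pairs are counted as ordered
    pairs, so each loop contributes twice.
    """
    signs = [[sign(x) for x in row] for row in data]
    n_cols = len(signs[0])
    scores = []
    for j, row_j in enumerate(signs):
        total = 0
        for j2, row_j2 in enumerate(signs):
            if j2 == j:
                continue
            dot = sum(a * b for a, b in zip(row_j, row_j2))
            # dot**2 sums the loop sign over all (k, k2), including k == k2;
            # each of the n_cols degenerate terms equals 1, so remove them.
            total += dot * dot - n_cols
        scores.append(total)
    return scores


def planted_example(n_rows=40, n_cols=40, m=10, n=20, seed=0):
    """Hypothetical test data: Gaussian noise with an exactly rank-1
    m-by-n block planted in the top-left corner."""
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_cols)] for _ in range(n_rows)]
    u = [rng.gauss(0, 1) for _ in range(m)]
    v = [rng.gauss(0, 1) for _ in range(n)]
    for j in range(m):
        for k in range(n):
            data[j][k] = u[j] * v[k]
    return data


if __name__ == "__main__":
    scores = row_loop_scores(planted_example())
    # Rows with the highest scores are the candidates for the bicluster.
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    print(ranked[:10])
```

Rows inside the planted block share many rank-1 loops with one another and so receive much larger scores than noise rows. Column scores can be obtained by applying the same function to the transposed matrix. A full method would presumably use such scores to decide which rows and columns to retain, a step this sketch leaves out.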

Software Mentioned

Seek
Matlab
