PMID: 12016053 · May 23, 2002

Binary analysis and optimization-based normalization of gene expression data

Ilya Shmulevich, Wei Zhang


Most approaches to gene expression analysis use real-valued expression data, produced by high-throughput screening technologies such as microarrays. Often, some measure of similarity must be computed in order to extract meaningful information from the observed data. The choice of this similarity measure frequently has a profound effect on the results of the analysis, yet no standards exist to guide the researcher. To address this issue, we propose to analyse gene expression data entirely in the binary domain. The natural measure of similarity then becomes the Hamming distance, which reflects the notion of similarity used by biologists. We also develop a novel data-dependent, optimization-based method, based on Genetic Algorithms (GAs), for normalizing gene expression data. This is a necessary step before quantizing gene expression data into the binary domain and, more generally, for comparing data between different arrays. We then present an algorithm for binarizing gene expression data and illustrate the use of the above methods on two different sets of data. Using Multidimensional Scaling, we show that a reasonable degree of separation between different tumor types in each data set can be achieved by working solely in the binary domain. Th...
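The binary-domain idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the authors' binarization algorithm or their GA-based normalization, so the sketch below assumes a simple per-gene median threshold as a stand-in and skips normalization entirely. The function names (`binarize`, `hamming`, `distance_matrix`) and the toy expression values are hypothetical and serve only to show how Hamming distances between binarized samples might be computed.

```python
from statistics import median


def binarize(samples):
    """Binarize a samples-by-genes matrix, one threshold per gene.

    ASSUMPTION: a per-gene median threshold is used purely for illustration;
    it is not the binarization algorithm of the paper.
    """
    n_genes = len(samples[0])
    thresholds = [median(row[g] for row in samples) for g in range(n_genes)]
    return [
        [1 if row[g] > thresholds[g] else 0 for g in range(n_genes)]
        for row in samples
    ]


def hamming(a, b):
    """Count the positions at which two equal-length binary profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(profiles):
    """Pairwise Hamming distances between all binarized samples."""
    return [[hamming(p, q) for q in profiles] for p in profiles]


# Hypothetical toy data: 4 samples (rows) x 5 genes (columns).
expression = [
    [2.1, 0.3, 1.8, 0.2, 1.1],
    [1.9, 0.4, 2.0, 0.1, 0.9],
    [0.2, 1.7, 0.3, 1.9, 1.0],
    [0.4, 1.5, 0.1, 2.2, 1.2],
]

binary = binarize(expression)
dist = distance_matrix(binary)

for row in dist:
    print(row)
```

In this toy example the first two samples and the last two samples form two groups with small within-group distances. A distance matrix of this kind is the sort of input that a Multidimensional Scaling routine could embed in two dimensions for visual inspection.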



Related Concepts

Mixed Gliomas
Leiomyosarcoma, Myxoid
Pattern Recognition System
Two-Parameter Models
Disease Clustering
Spearman Rank Correlation Coefficient
cDNA Microarrays
mRNA Differential Display
