PMID: 15360927 | Sep 14, 2004 | Paper

A log likelihood predictor for genomic classification of oral cancer using principle component analysis for feature selection

Studies in Health Technology and Informatics
Mark E Whipple, Chu Chen

Abstract

DNA microarrays are powerful tools for exploring gene expression and predicting disease state. However, since the number of variables (genes) typically exceeds the number of samples (tissue specimens), many potentially spurious genes may be selected for a predictor function. Principal component analysis (PCA) can greatly reduce the high-dimensional microarray data space while retaining most of the inherent variability. We propose a methodology that uses PCA to identify a predictor vector between two mutually exclusive and collectively exhaustive classes. By projecting the training set upon this vector, a distribution of projections can be computed for each class. A log-likelihood ratio is then calculated for class membership. We used this methodology to classify 48 biopsy specimens as either oral squamous cell carcinoma or normal oral mucosa using oligonucleotide microarrays. The system was trained using a set of half the samples, and correctly predicted the membership of the other half. The three most highly positively and three most highly negatively predictive genes were all keratins that are known markers of squamous cell carcinoma.
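
The pipeline described in the abstract (PCA, projection onto a predictor vector, per-class distributions, log-likelihood ratio) can be illustrated with a minimal sketch. The abstract does not say how the predictor vector is chosen among the principal components or which distribution is fitted to the projections, so the code below makes two assumptions: it picks the principal component with the largest standardized difference between class means, and it models each class's projections as a Gaussian. The function names `fit_llr_predictor` and `log_likelihood_ratio`, the parameter `n_components`, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_llr_predictor(X, y, n_components=5):
    """Fit a one-dimensional PCA-based predictor.

    X: (samples, genes) expression matrix; y: array of 0/1 class labels.
    Assumption: the predictor vector is the principal component that best
    separates the two class means (not necessarily the paper's rule).
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal axes of the centered training data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vt = Vt[:n_components]
    scores = Xc @ Vt.T  # projections, shape (samples, n_components)
    s0, s1 = scores[y == 0], scores[y == 1]
    # Standardized difference of class means along each component.
    sep = np.abs(s1.mean(axis=0) - s0.mean(axis=0)) / np.sqrt(
        s0.var(axis=0, ddof=1) + s1.var(axis=0, ddof=1)
    )
    k = int(np.argmax(sep))
    v = Vt[k]
    # Assumption: Gaussian model (mean, std) of the projections per class.
    params = {
        c: (scores[y == c, k].mean(), scores[y == c, k].std(ddof=1))
        for c in (0, 1)
    }
    return mean, v, params

def log_likelihood_ratio(X, mean, v, params):
    """Return log p(proj | class 1) - log p(proj | class 0) per sample."""
    p = (X - mean) @ v

    def logpdf(x, mu, sd):
        return -0.5 * np.log(2 * np.pi * sd**2) - (x - mu) ** 2 / (2 * sd**2)

    return logpdf(p, *params[1]) - logpdf(p, *params[0])

# Synthetic stand-in for microarray data: 48 samples, 200 "genes",
# with the first 20 genes shifted upward in class 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(48, 200))
y = np.repeat([0, 1], 24)
X[y == 1, :20] += 3.0

# Train on half of each class, test on the other half.
train_idx = np.r_[0:12, 24:36]
test_idx = np.r_[12:24, 36:48]
mean, v, params = fit_llr_predictor(X[train_idx], y[train_idx])
llr = log_likelihood_ratio(X[test_idx], mean, v, params)
pred = (llr > 0).astype(int)  # positive ratio favors class 1
accuracy = float((pred == y[test_idx]).mean())
print("held-out accuracy:", accuracy)
```

In this sketch, the entries of `v` with the largest positive and negative values would play the role of the most predictive genes, which is one plausible reading of how the abstract's keratin markers were identified.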
