Strong feature sets from small samples

Journal of Computational Biology : a Journal of Computational Molecular Cell Biology
Seungchan KimJeffrey M Trent

Abstract

For small samples, classifier design algorithms typically suffer from overfitting. Given a set of features, a classifier must be designed and its error estimated. For small samples, an error estimator may be unbiased but, owing to a large variance, often gives very optimistic estimates. This paper proposes mitigating the small-sample problem by designing classifiers from a probability distribution resulting from spreading the mass of the sample points to make classification more difficult, while maintaining sample geometry. The algorithm is parameterized by the variance of the spreading distribution. By increasing the spread, the algorithm finds gene sets whose classification accuracy remains strong relative to greater spreading of the sample. The error gives a measure of the strength of the feature set as a function of the spread. The algorithm yields feature sets that can distinguish the two classes, not only for the sample data, but for distributions spread beyond the sample data. For linear classifiers, the topic of the present paper, the classifiers are derived analytically from the model, thereby providing an enormous saving in computation time. The algorithm is applied to cancer classification via cDNA microarrays. In pa...
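The idea of measuring error as a function of the spread can be illustrated with a minimal sketch. It is not the authors' derivation: it assumes the spreading distribution is an isotropic Gaussian centred on each sample point (the abstract specifies only its variance), and it evaluates a fixed, already chosen linear classifier rather than designing one. The names `spread_error`, `w`, `b`, and `sigma` are hypothetical. Under these assumptions the error has a closed form, because the projection of a Gaussian onto the hyperplane normal is itself Gaussian.

```python
import math
import numpy as np


def spread_error(X, y, w, b, sigma):
    """Error of the classifier 'predict 1 if w.x + b > 0' when every sample
    point x_i is replaced by an isotropic Gaussian N(x_i, sigma^2 I).

    ASSUMPTION: Gaussian spreading; the source only names a variance parameter.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    w = np.asarray(w, dtype=float)

    # Signed distance of each sample point to the hyperplane.
    dist = (X @ w + b) / np.linalg.norm(w)
    # +1 for class 1, -1 for class 0, so sign * dist > 0 means "correct side".
    sign = np.where(y == 1, 1.0, -1.0)

    if sigma == 0:
        # No spreading: plain resubstitution error on the sample.
        return float(np.mean(sign * dist <= 0))

    # Mass of each spread point that falls on the wrong side:
    # Phi(-sign * dist / sigma), with Phi the standard normal CDF.
    z = -sign * dist / sigma
    wrong_mass = [0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z]
    return float(np.mean(wrong_mass))


def error_curve(X, y, w, b, sigmas):
    """Error as a function of the spread, one value per sigma."""
    return [spread_error(X, y, w, b, s) for s in sigmas]
```

Calling `error_curve(X, y, w, b, [0.0, 0.5, 1.0, 2.0])` on a labelled sample gives one error value per spread. In this sketch, a feature set whose curve rises slowly as `sigma` grows would count as "strong" in the sense the abstract describes.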
