Abstract
Given a set of molecular structure data preclassified into a number of classes, the molecular classification problem is concerned with discovering interesting structural patterns in the data so that "unseen" molecules not originally in the dataset can be accurately classified. To tackle the problem, interesting molecular substructures have to be discovered. This is typically done by first representing molecular structures as molecular graphs and then using graph-mining algorithms to discover frequently occurring subgraphs in them. These subgraphs are then used to characterize the different classes for molecular classification. While such an approach can be very effective, a substructure that occurs frequently in one class may also occur frequently in another. Discovering frequent subgraphs may, therefore, not always be the most effective basis for molecular classification. In this paper, we propose a novel technique called mining interesting substructures in molecular data for classification (MISMOC) that can discover interesting frequent subgraphs not just for characterizing a molecular class but also for distinguishing it from the others. Using a test statistic, MISMOC screens each f...
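The point that frequency alone does not make a substructure discriminative can be illustrated with a small sketch. The abstract does not say which test statistic MISMOC uses, so the code below assumes a generic adjusted (standardized) residual on a class-by-presence contingency table; the function name `adjusted_residuals`, the class labels, and the toy counts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def adjusted_residuals(has_pattern, labels):
    """Adjusted residual of pattern presence per class (illustrative only).

    has_pattern: list of bools, True if the molecule contains the subgraph.
    labels: list of class labels, same length as has_pattern.
    Values far from 0 suggest presence is associated with that class.
    """
    n = len(labels)
    class_totals = Counter(labels)
    p = sum(has_pattern) / n  # overall fraction of molecules with the pattern
    observed = Counter(c for h, c in zip(has_pattern, labels) if h)
    result = {}
    for c, n_c in class_totals.items():
        expected = n_c * p  # expected count if pattern and class were independent
        variance = expected * (1 - n_c / n) * (1 - p)
        result[c] = (observed[c] - expected) / math.sqrt(variance) if variance > 0 else 0.0
    return result


# Hypothetical toy data: 10 "active" and 10 "inactive" molecules.
labels = ["active"] * 10 + ["inactive"] * 10

# Pattern A is frequent in both classes (9 of 10 and 8 of 10).
pattern_a = [True] * 9 + [False] * 1 + [True] * 8 + [False] * 2
# Pattern B is frequent in "active" only (9 of 10 versus 1 of 10).
pattern_b = [True] * 9 + [False] * 1 + [True] * 1 + [False] * 9

z_a = adjusted_residuals(pattern_a, labels)
z_b = adjusted_residuals(pattern_b, labels)
print(z_a)
print(z_b)
```

In this toy data both patterns have 90% support in the "active" class, yet only pattern B deviates strongly from what independence would predict. A screening step of this kind would keep B and discard A, assuming a conventional cutoff such as |z| > 1.96.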