Using machine learning techniques and genomic/proteomic information from known databases for defining relevant features for PPI classification

Computers in Biology and Medicine
J M Urquiza, M Cepero

Abstract

In modern proteomics, the prediction of protein-protein interactions (PPIs) is a key line of research, as these interactions take part in most essential biological processes. This paper proposes a new approach to PPI data classification based on extracting genomic and proteomic information from well-known databases and incorporating semantic measures. The approach applies data mining techniques and yields very accurate models with high sensitivity and specificity in the classification of PPIs. The models are learned with the well-known support vector machine paradigm and also return a new confidence score that may help expert researchers filter and validate new external PPIs. The organism studied is yeast, one of the most widely analyzed. We processed a very high-confidence dataset by extracting up to 26 specific features from the chosen databases, half of them calculated using two new similarity measures proposed in this paper. Then, by applying a filter-wrapper algorithm for feature selection, we obtained a final set composed of the eight most relevant features for predicting PPIs, which was validated by a ROC analysis. The prediction cap...
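The abstract reports sensitivity, specificity, and a ROC analysis over classifier confidence scores. As a minimal sketch of how those quantities relate, the snippet below computes them from a list of labels and scores. The function names, the toy labels and scores, and the 0.45 threshold are all assumptions made for illustration; none of them come from the paper, and this is not the authors' evaluation code.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive pair scores higher than a random negative pair (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when pairs with score >= threshold
    are predicted to interact."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]  # illustrative confidence scores

auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.45)
print(f"AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Raising the threshold trades sensitivity for specificity, which is the trade-off a ROC curve traces; a confidence score used to filter candidate PPIs amounts to picking one point on that curve.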

