Prediction of protein solvent accessibility using support vector machines

Proteins
Zheng Yuan, John S Mattick

Abstract

A support vector machine (SVM) learning system was trained to predict protein solvent accessibility from primary structure. Different kernel functions and sliding window sizes were explored to determine how they affect prediction performance. Using a cut-off threshold of 15%, which splits the dataset evenly (an equal number of exposed and buried residues), the method achieved a prediction accuracy of 70.1% for single-sequence input and 73.9% for multiple-sequence-alignment input. The prediction of three or more states of solvent accessibility was also studied and compared with other methods. The prediction accuracies are better than, or comparable to, those obtained by other methods such as neural networks, Bayesian classification, multiple linear regression, and information theory. In addition, our results suggest that this system may be combined with other prediction methods to achieve more reliable results, and that the SVM method is a very useful tool for biological sequence analysis.
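The single-sequence setup described in the abstract can be illustrated with a minimal sketch of the input preparation: a sliding window around each residue is turned into a fixed-length vector, and each residue is labelled exposed or buried at the 15% cut-off. The abstract does not give the encoding details, so everything here is an assumption for illustration: the one-hot encoding with an extra padding slot, the window size of 15, the `>=` convention at the threshold, and the helper names `encode_windows` and `label_two_state`.

```python
# Hypothetical sketch of window encoding and two-state labelling.
# The encoding scheme, window size, and function names are assumptions,
# not the authors' published implementation.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
SLOT = len(AMINO_ACIDS) + 1  # 20 residue types + 1 flag for padding/unknown


def encode_windows(sequence, window=15):
    """Return one one-hot feature vector per residue, from a centred window."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a central residue")
    half = window // 2
    vectors = []
    for centre in range(len(sequence)):
        vec = [0] * (window * SLOT)
        for offset in range(-half, half + 1):
            pos = centre + offset
            slot_start = (offset + half) * SLOT
            if 0 <= pos < len(sequence) and sequence[pos] in AA_INDEX:
                vec[slot_start + AA_INDEX[sequence[pos]]] = 1
            else:
                # Position lies beyond a terminus, or the residue is unknown.
                vec[slot_start + SLOT - 1] = 1
        vectors.append(vec)
    return vectors


def label_two_state(relative_accessibility, threshold=15.0):
    """Map relative accessibility (percent) to +1 (exposed) or -1 (buried)."""
    return [1 if rsa >= threshold else -1 for rsa in relative_accessibility]


if __name__ == "__main__":
    features = encode_windows("ACDK", window=3)
    labels = label_two_state([3.0, 15.0, 40.0, 62.5])
    print(len(features), len(features[0]), labels)
```

Vectors and labels of this shape could then be passed to any SVM implementation, with the kernel function and window size varied as the abstract describes.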
