Mismatch string kernels for discriminative protein classification

Bioinformatics
Christina S Leslie ... William Stafford Noble

Abstract

Classification of protein sequences into functional and structural families based on sequence homology is a central problem in computational biology. Discriminative supervised machine learning approaches provide good performance, but simplicity and computational efficiency of training and prediction are also important concerns. We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the problem of protein classification and remote homology detection. These kernels measure sequence similarity based on shared occurrences of fixed-length patterns in the data, allowing for mutations between patterns. Thus, the kernels provide a biologically well-motivated way to compare protein sequences without relying on family-based generative models such as hidden Markov models. We compute the kernels efficiently using a mismatch tree data structure, allowing us to calculate the contributions of all patterns occurring in the data in one pass while traversing the tree. When used with an SVM, the kernels enable fast prediction on test sequences. We report experiments on two benchmark SCOP datasets, where we show that the mismatch kernel used with an SVM classifi...
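To make the idea of "shared occurrences of fixed-length patterns, allowing for mutations" concrete, the following is a minimal sketch of a (k, m)-mismatch kernel. It assumes the usual definition: each length-k substring of a sequence contributes a count to every length-k string within Hamming distance m of it, and the kernel value is the inner product of the two resulting count vectors. This is a brute-force illustration, not the mismatch tree traversal described in the abstract, and the names `neighborhood`, `feature_map`, `mismatch_kernel` and `AMINO_ACIDS` are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

# Assumption: the 20 standard amino acids as the alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def neighborhood(kmer, m, alphabet=AMINO_ACIDS):
    """Return all strings within Hamming distance <= m of kmer."""
    result = {kmer}
    for d in range(1, m + 1):
        # Choose d positions and try every letter combination there.
        # Re-using the original letter is allowed, so this yields
        # distance <= d; the set removes duplicates.
        for positions in combinations(range(len(kmer)), d):
            for letters in product(alphabet, repeat=d):
                chars = list(kmer)
                for pos, letter in zip(positions, letters):
                    chars[pos] = letter
                result.add("".join(chars))
    return result


def feature_map(seq, k, m, alphabet=AMINO_ACIDS):
    """Sparse (k, m)-mismatch feature vector of seq as a Counter."""
    counts = Counter()
    for i in range(len(seq) - k + 1):
        for beta in neighborhood(seq[i:i + k], m, alphabet):
            counts[beta] += 1
    return counts


def mismatch_kernel(x, y, k, m, alphabet=AMINO_ACIDS):
    """Inner product of the mismatch feature vectors of x and y."""
    fx = feature_map(x, k, m, alphabet)
    fy = feature_map(y, k, m, alphabet)
    if len(fx) > len(fy):
        fx, fy = fy, fx  # iterate over the smaller vector
    return sum(count * fy[beta] for beta, count in fx.items())
```

With m = 0 the sketch reduces to counting exact shared k-mers. Its cost grows quickly with k, m and the alphabet size, which is the kind of overhead an efficient data structure such as the mismatch tree is meant to avoid. A matrix of these kernel values over a training set could be handed to any SVM implementation that accepts a precomputed kernel.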
