Probabilistic grammatical model for helix-helix contact site classification

Algorithms for Molecular Biology : AMB
Witold Dyrka, ..., Malgorzata Kotulska

Abstract

Hidden Markov Models power many state-of-the-art tools in protein bioinformatics. Although they excel at their tasks, these methods of protein analysis do not directly convey information on medium- and long-range residue-residue interactions. Capturing such interactions requires the expressive power of at least context-free grammars. However, the application of more powerful grammar formalisms to protein analysis has been surprisingly limited. In this work, we present a probabilistic grammatical framework for problem-specific protein languages and apply it to the classification of transmembrane helix-helix pair configurations. The core of the model is a probabilistic context-free grammar, automatically inferred by a genetic algorithm from only a generic set of expert-based rules and positive training samples. The model was applied to produce sequence-based descriptors of four classes of transmembrane helix-helix contact site configurations. The best-performing classifiers reached an AUCROC of 0.70. Analysis of the grammar parse trees revealed an ability to represent structural features of helix-helix contact sites. We demonstrated that our probabilistic context-free framework for analysis of protein sequences outperforms the sta...
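As an illustration of why a context-free grammar can express paired, long-range dependencies, the following minimal sketch scores a sequence under a toy probabilistic context-free grammar using the standard inside (probabilistic CYK) algorithm. The grammar, its probabilities, the two-letter alphabet (`h`, `p`), and the function name `inside_probability` are all hypothetical and chosen only for illustration; they are not the grammar or the implementation used in the paper.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
# It generates nested pairs h^n p^n, so the first 'h' is tied to the last 'p'.
BINARY_RULES = {
    ("S", "A", "X"): 0.4,  # S -> A X
    ("S", "A", "B"): 0.6,  # S -> A B
    ("X", "S", "B"): 1.0,  # X -> S B
}
LEXICAL_RULES = {
    ("A", "h"): 1.0,  # A -> 'h'
    ("B", "p"): 1.0,  # B -> 'p'
}


def inside_probability(sequence, start="S"):
    """Return the total probability that `start` derives `sequence`."""
    n = len(sequence)
    if n == 0:
        return 0.0
    # chart[i][j][N] = probability that nonterminal N derives sequence[i..j]
    chart = [[defaultdict(float) for _ in range(n)] for _ in range(n)]

    # Spans of length 1: apply lexical rules.
    for i, symbol in enumerate(sequence):
        for (lhs, terminal), prob in LEXICAL_RULES.items():
            if terminal == symbol:
                chart[i][i][lhs] += prob

    # Longer spans: sum over every split point and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                left, right = chart[i][k], chart[k + 1][j]
                for (lhs, b, c), prob in BINARY_RULES.items():
                    if b in left and c in right:
                        chart[i][j][lhs] += prob * left[b] * right[c]

    return chart[0][n - 1].get(start, 0.0)


if __name__ == "__main__":
    for seq in ("hp", "hhpp", "hphp"):
        print(seq, inside_probability(seq))
```

Under this toy grammar, `hp` scores 0.6, `hhpp` scores approximately 0.24 (0.4 × 1.0 × 0.6), and `hphp` scores 0 because it has no nested derivation. A score of this kind could in principle be thresholded or compared across class-specific grammars to classify sequences, but how the paper turns grammar scores into classifiers is not specified in the abstract above.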

Citations

Feb 26, 2016·F1000Research·Jason E McDermott, ..., Stephen R Lindemann

Methods Mentioned

interaction prediction

Software Mentioned

ClustalW
HMMER2s
HHsearch
GA
GAlib
HMMER2
PDBTM
BLASTP
HMMER3
BLASTP concat
