How Pairwise Coevolutionary Models Capture the Collective Residue Variability in Proteins?

Molecular Biology and Evolution
Matteo Figliuzzi, Martin Weigt

Abstract

Global coevolutionary models of homologous protein families, as constructed by direct coupling analysis (DCA), have recently gained popularity, in particular because of their capacity to accurately predict residue-residue contacts from sequence information alone and thereby to facilitate tertiary and quaternary protein structure prediction. More recently, they have also been used to predict the fitness effects of amino-acid substitutions in proteins and to predict evolutionarily conserved protein-protein interactions. These models are based on two currently unjustified hypotheses: 1) correlations in the amino-acid usage of different positions result collectively from networks of direct couplings; and 2) pairwise couplings are sufficient to capture the amino-acid variability. Here, we propose a highly precise inference scheme based on Boltzmann-machine learning, which allows us to systematically address these hypotheses. We show how correlations are built up in a highly collective way by a large number of coupling paths, which are based on the protein's three-dimensional structure. We further find that pairwise coevolutionary models capture the collective residue variability across homologous proteins even for quantities which are ...
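To make the idea of Boltzmann-machine learning for a pairwise model concrete, here is a minimal toy sketch. It is not the authors' implementation. It assumes the standard pairwise Potts form, with energy E(a) = -sum_i h_i(a_i) - sum_{i<j} J_ij(a_i, a_j), and it computes model statistics by exact enumeration of all sequences, which is only feasible for tiny examples; real protein families need Monte Carlo sampling, sequence reweighting, and regularization. All function names, the toy alignment, the learning rate, and the step count are illustrative assumptions.

```python
from itertools import product
from math import exp


def potts_energy(seq, h, J):
    """Pairwise Potts energy: -sum_i h[i][a_i] - sum_{i<j} J[i, j][a_i][a_j]."""
    L = len(seq)
    energy = -sum(h[i][seq[i]] for i in range(L))
    for i in range(L):
        for j in range(i + 1, L):
            energy -= J[i, j][seq[i]][seq[j]]
    return energy


def marginals(weighted_seqs, L, q):
    """One- and two-site frequencies from (sequence, weight) pairs; weights sum to 1."""
    f1 = [[0.0] * q for _ in range(L)]
    f2 = {(i, j): [[0.0] * q for _ in range(q)]
          for i in range(L) for j in range(i + 1, L)}
    for seq, w in weighted_seqs:
        for i in range(L):
            f1[i][seq[i]] += w
            for j in range(i + 1, L):
                f2[i, j][seq[i]][seq[j]] += w
    return f1, f2


def model_marginals(h, J, L, q):
    """Exact model marginals by enumerating all q**L sequences (toy sizes only)."""
    seqs = list(product(range(q), repeat=L))
    weights = [exp(-potts_energy(s, h, J)) for s in seqs]
    Z = sum(weights)
    return marginals([(s, w / Z) for s, w in zip(seqs, weights)], L, q)


def fit(alignment, q, lr=0.3, steps=2000):
    """Gradient ascent on the log-likelihood: each parameter moves by
    lr * (empirical frequency - model frequency)."""
    L = len(alignment[0])
    f1, f2 = marginals([(s, 1.0 / len(alignment)) for s in alignment], L, q)
    h = [[0.0] * q for _ in range(L)]
    J = {pair: [[0.0] * q for _ in range(q)] for pair in f2}
    for _ in range(steps):
        p1, p2 = model_marginals(h, J, L, q)
        for i in range(L):
            for a in range(q):
                h[i][a] += lr * (f1[i][a] - p1[i][a])
        for pair in J:
            for a in range(q):
                for b in range(q):
                    J[pair][a][b] += lr * (f2[pair][a][b] - p2[pair][a][b])
    return h, J


# Hypothetical toy "alignment": 3 positions, 2 symbols, every sequence present.
alignment = [(0, 0, 0)] * 3 + [(1, 1, 1)] * 3 + [
    (0, 1, 1), (1, 0, 0), (0, 0, 1), (1, 1, 0), (0, 1, 0), (1, 0, 1)]
h, J = fit(alignment, q=2)
f1, f2 = marginals([(s, 1.0 / len(alignment)) for s in alignment], 3, 2)
p1, p2 = model_marginals(h, J, 3, 2)
```

After fitting, the model's one- and two-site statistics (`p1`, `p2`) should match the empirical ones (`f1`, `f2`). Whether quantities that were not fitted are also reproduced is the kind of question the second hypothesis raises.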
