Fast and accurate multivariate Gaussian modeling of protein families: predicting residue contacts and protein-interaction partners

PLoS ONE
Carlo Baldassi, …, Andrea Pagnani

Abstract

In the course of evolution, proteins show a remarkable conservation of their three-dimensional structure and their biological function, leading to strong evolutionary constraints on the sequence variability between homologous proteins. Our method aims at extracting such constraints from rapidly accumulating sequence data, and thereby at inferring protein structure and function from sequence information alone. Recently, global statistical inference methods (e.g. direct-coupling analysis, sparse inverse covariance estimation) have achieved a breakthrough towards this aim, and their predictions have been successfully incorporated into tertiary and quaternary protein structure prediction methods. However, because the underlying variables (amino acids) are discrete, exact inference requires time exponential in the protein length, and efficient approximations are needed for practical applicability. Here we propose a very efficient multivariate Gaussian modeling approach as a variant of direct-coupling analysis: the discrete amino-acid variables are replaced by continuous Gaussian random variables. The resulting statistical inference problem is efficiently and exactly solvable. We show that the quality of inference is comparabl...
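
The Gaussian idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`one_hot`, `gaussian_couplings`, `contact_scores`), the 20-letter alphabet, the shrinkage regularizer, and the Frobenius-norm scoring are all assumptions chosen for illustration. The sketch one-hot encodes an alignment, treats the resulting vectors as if they were Gaussian, and reads pairwise couplings off the inverse of a regularized covariance matrix, which takes a single matrix inversion rather than time exponential in the sequence length.

```python
import numpy as np

# Assumed alphabet: 20 amino acids; gaps and unknown symbols become all-zero vectors.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"
Q = len(ALPHABET)


def one_hot(msa):
    """Encode M aligned sequences of length L as an (M, L*Q) 0/1 matrix."""
    index = {aa: k for k, aa in enumerate(ALPHABET)}
    M, L = len(msa), len(msa[0])
    X = np.zeros((M, L * Q))
    for m, seq in enumerate(msa):
        for i, aa in enumerate(seq):
            k = index.get(aa)
            if k is not None:
                X[m, i * Q + k] = 1.0
    return X


def gaussian_couplings(X, shrinkage=0.5):
    """Fit a Gaussian to the encoded data and return couplings J = -C^{-1}.

    The empirical covariance is usually singular, so it is shrunk toward a
    scaled identity matrix (an illustrative regularizer, assumed here).
    """
    C = np.cov(X, rowvar=False, bias=True)
    n = C.shape[0]
    C_reg = (1.0 - shrinkage) * C + shrinkage * np.eye(n) / Q
    return -np.linalg.inv(C_reg)


def contact_scores(J, L):
    """Score each residue pair by the Frobenius norm of its Q x Q coupling block."""
    blocks = J.reshape(L, Q, L, Q)          # blocks[i, a, j, b] = J[i*Q + a, j*Q + b]
    S = np.sqrt((blocks ** 2).sum(axis=(1, 3)))
    np.fill_diagonal(S, 0.0)                # ignore self-couplings
    return S
```

On a toy alignment such as `["AAC", "AAD", "CCC", "CCD"]`, where the first two columns covary and the third is independent, `contact_scores(gaussian_couplings(one_hot(msa)), L=3)` ranks the pair (0, 1) above the other pairs. Real alignments would additionally need sequence reweighting and a more careful choice of regularization, which this sketch omits.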

Associated Software

Sep 19, 2014 · Riccardo Zecchina, …, Marco Zamparo

Related Concepts

Biochemical Pathway
Caulobacter crescentus
Protein Family
Caulobacter vibrioides
Histidine
Complex (molecular entity)
Tertiary Protein Structure
Protein Phosphorylation
Cross Reactions
