ROC and confusion analysis of structure comparison methods identify the main causes of divergence from manual protein classification.

BMC Bioinformatics
Vichetra Sam, Peter J Munson

Abstract

Current classifications of protein folds are based, ultimately, on visual inspection of similarities. Previous attempts to use computerized structure comparison methods show only partial agreement with curated databases, but have failed to provide detailed statistical and structural analysis of the causes of these divergences. We construct a map of similarities/dissimilarities among manually defined protein folds, using a score cutoff value determined by means of the Receiver Operating Characteristic (ROC) curve. The map identifies folds which appear to overlap or to be "confused" with each other by two distinct similarity measures. It also identifies folds which appear inhomogeneous in that they contain apparently dissimilar domains, as measured by both similarity measures. At a low (1%) false positive rate, 25 to 38% of domain pairs in the same SCOP folds do not appear similar. Our results suggest either that some of these folds are defined using criteria other than purely structural considerations or that the similarity measures used do not recognize some relevant aspects of structural similarity in certain cases. Specifically, variations of the "common core" of some folds are severe enough to defeat attempts to automatically detect str...
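The cutoff step described in the abstract can be illustrated with a minimal sketch. It assumes each domain pair already has a single similarity score (higher meaning more similar) and a label saying whether the two domains share a SCOP fold. The function names `cutoff_at_fpr` and `missed_fraction` and the toy scores are hypothetical and are not taken from the paper, whose actual scoring and ROC procedure are not reproduced here.

```python
import math


def cutoff_at_fpr(different_fold_scores, max_fpr=0.01):
    """Return a score cutoff whose false positive rate is at most max_fpr.

    A pair is called "similar" when its score is strictly greater than the
    cutoff, so the false positive rate is the fraction of different-fold
    pairs scoring above it. Assumes a non-empty list and 0 <= max_fpr < 1.
    """
    ranked = sorted(different_fold_scores, reverse=True)
    # Number of different-fold pairs we tolerate being called similar.
    allowed = math.floor(max_fpr * len(ranked))
    # Only the top `allowed` scores can lie strictly above this value.
    return ranked[allowed]


def missed_fraction(same_fold_scores, cutoff):
    """Fraction of same-fold pairs that do not appear similar at the cutoff."""
    missed = sum(1 for score in same_fold_scores if score <= cutoff)
    return missed / len(same_fold_scores)


if __name__ == "__main__":
    # Toy scores for illustration only; not data from the paper.
    different_fold = list(range(100))
    same_fold = [50, 97, 98, 99, 120, 150, 200, 300]

    cutoff = cutoff_at_fpr(different_fold, max_fpr=0.01)
    print("cutoff:", cutoff)
    print("same-fold pairs missed:", missed_fraction(same_fold, cutoff))
```

Fixing the false positive rate first and then reading off the missed same-fold fraction mirrors how the 25 to 38% figure is framed in the abstract: it is the false negative rate at the 1% operating point of the ROC curve.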



Software Mentioned

MSCL
VAST Vector Alignment Search Tool
Matlab Statistics toolbox
ASTRAL
MathType
Biowulf
SHEBA
CATH
DALI
MATLAB
