Mar 25, 2020

CATH functional families predict protein functional sites

BioRxiv : the Preprint Server for Biology
Sayoni Das, Christine A Orengo

Abstract

Motivation: Identification of functional sites in proteins is essential for functional characterisation, variant interpretation and drug design. Several methods are available for predicting either a generic functional site or specific types of functional site. Here, we present FunSite, a machine learning predictor that identifies catalytic, ligand-binding and protein-protein interaction functional sites using features derived from protein sequence and structure, and evolutionary data from CATH functional families (FunFams).

Results: FunSite's prediction performance was rigorously benchmarked using cross-validation and a holdout dataset. FunSite outperformed all publicly available functional site prediction methods. We show that conserved residues in FunFams are enriched in functional sites. We found that FunSite's performance depends greatly on the quality of functional site annotations and the information content of FunFams in the training data. Finally, we analyse which structural and evolutionary features are most predictive for functional sites.

Availability: The datasets and prediction models are available on request.

Contact: c.orengo@ucl.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.
