Apr 16, 2020

Classifying protein structures into folds by convolutional neural networks, distance maps, and persistent homology

BioRxiv : the Preprint Server for Biology
Y. Hong, Jianlin Cheng

Abstract

The fold classification of a protein reveals valuable information about its shape and function, so it is important to find a mapping between protein structures and their folds. Numerous machine learning techniques predict protein folds from 1-dimensional (1D) protein sequences, but few machine learning methods directly classify protein 3D (tertiary) structures into predefined folds (e.g. folds defined in the SCOP database). We develop a 2D convolutional neural network to classify any protein structure into one of 1232 folds. We extract two classes of input features for each protein: a residue-residue distance matrix and persistent homology images derived from the 3D protein structure. Due to restrictions in computing resources, we sample every other point in the alpha-carbon chain to generate a reduced distance map representation, and we find that this subsampling does not lead to a significant loss in accuracy. Using the distance matrix, we achieve an accuracy of 95.2% on the SCOP dataset. With persistent homology images of 100 x 100 resolution, we achieve an accuracy of 56% on the SCOPe 2.07 dataset. Combining the two kinds of features further improves classification accuracy. The source code of our method (PRO3DCNN) is available at ...
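The reduced distance map described in the abstract can be illustrated with a minimal sketch. This is not the PRO3DCNN implementation: the function name `distance_map`, the `stride` parameter, and the random toy coordinates are assumptions made for illustration, and the only idea taken from the abstract is computing residue-residue distances over every other alpha-carbon.

```python
import numpy as np

def distance_map(ca_coords, stride=2):
    """Pairwise Euclidean distances over every `stride`-th alpha-carbon.

    `ca_coords` is an (N, 3) array of alpha-carbon coordinates.
    Both the name and the `stride` parameter are hypothetical; stride=2
    corresponds to "sample every other point" in the abstract.
    """
    coords = np.asarray(ca_coords, dtype=float)
    if coords.ndim != 2 or coords.shape[1] != 3:
        raise ValueError("expected an (N, 3) array of coordinates")
    sampled = coords[::stride]
    # Broadcast to an (M, M, 3) array of coordinate differences.
    diff = sampled[:, None, :] - sampled[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy chain of 10 pseudo-residues (random walk, not a real protein).
rng = np.random.default_rng(0)
toy_chain = np.cumsum(rng.normal(size=(10, 3)), axis=0)

full = distance_map(toy_chain, stride=1)     # shape (10, 10)
reduced = distance_map(toy_chain, stride=2)  # shape (5, 5)
```

In this sketch, subsampling the chain before computing distances gives the same matrix as taking every other row and column of the full map, while cutting the number of entries by roughly a factor of four.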
