Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks

BMC Genomics
Xiaoyong Pan, Hong-Bin Shen

Abstract

RNA regulation depends heavily on its binding protein partners, known as RNA-binding proteins (RBPs). Unfortunately, the binding preferences of most RBPs are still not well characterized. Interdependencies between sequence and secondary structure specificities make it challenging both to predict RBP binding sites and to detect accurate sequence and structure motifs. In this study, we propose a deep learning-based method, iDeepS, to simultaneously identify the binding sequence and structure motifs from RNA sequences using convolutional neural networks (CNNs) and a bidirectional long short-term memory network (BLSTM). We first perform one-hot encoding of both the sequence and the predicted secondary structure to enable subsequent convolution operations. To reveal the hidden binding knowledge in the observed sequences, the CNNs are applied to learn abstract features. Considering the close relationship between sequence and predicted structure, we use the BLSTM to capture possible long-range dependencies between the binding sequence and structure motifs identified by the CNNs. Finally, the learned weighted representations are fed into a classification layer to predict the RBP binding sites. We evaluated iDeepS on veri...
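
The encoding and convolution steps described in the abstract can be illustrated with a minimal sketch. This is not the iDeepS implementation: the structure alphabet `FTIHMS`, the uniform row for unknown symbols, the toy input strings, and the hand-built `ACU` filter are all assumptions made for illustration, and the BLSTM and classification layers are omitted. In a trained model the filter weights would be learned rather than fixed.

```python
from typing import List

RNA_ALPHABET = "ACGU"        # nucleotide alphabet
STRUCT_ALPHABET = "FTIHMS"   # assumed structure-context labels, for illustration only


def one_hot(symbols: str, alphabet: str) -> List[List[float]]:
    """Return a len(symbols) x len(alphabet) one-hot matrix.

    Symbols outside the alphabet (e.g. 'N') get a uniform row; this
    convention is an assumption, not taken from the abstract.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    uniform = [1.0 / len(alphabet)] * len(alphabet)
    rows = []
    for ch in symbols.upper():
        if ch in index:
            row = [0.0] * len(alphabet)
            row[index[ch]] = 1.0
        else:
            row = list(uniform)
        rows.append(row)
    return rows


def conv1d_scan(matrix: List[List[float]], kernel: List[List[float]]) -> List[float]:
    """Slide one k x alphabet filter over the matrix (no padding, stride 1)
    and apply a ReLU to each window score."""
    k = len(kernel)
    scores = []
    for start in range(len(matrix) - k + 1):
        total = 0.0
        for offset in range(k):
            total += sum(m * w for m, w in zip(matrix[start + offset], kernel[offset]))
        scores.append(max(0.0, total))
    return scores


# Toy inputs (hypothetical): a short RNA and a per-base structure annotation.
seq = "GGACUUCC"
struct = "SSHHHHSS"

seq_matrix = one_hot(seq, RNA_ALPHABET)            # 8 x 4
struct_matrix = one_hot(struct, STRUCT_ALPHABET)   # 8 x 6

# A hand-built filter that scores 1 per position matching "ACU".
acu_filter = one_hot("ACU", RNA_ALPHABET)
scores = conv1d_scan(seq_matrix, acu_filter)
print(scores)  # the highest score marks the window that matches the filter exactly
```

The position of the maximum score identifies where the filter's pattern occurs, which is the sense in which a convolutional filter over one-hot input acts as a motif detector. The same scan applies unchanged to `struct_matrix` with a filter of width 6.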
