Modeling the properties and functions of DNA sequences is an important but challenging task in genomics. The task is particularly difficult for non-coding DNA, the vast majority of which remains poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA could greatly benefit both basic science and translational research, because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory 'grammar' that improves predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ achieves a relative improvement of over 50% in the area under the precision-recall curve compared with related models. The source code is available on GitHub at http://github.com/uci-cbcl/DanQ.
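To make the role of the convolution layer concrete, the following is a minimal pure-Python sketch of how a single convolutional filter can act as a motif scanner over one-hot encoded DNA. It is an illustration under stated assumptions, not the DanQ implementation: the function names, the hand-set "TATA" filter, the example sequence, and the pooling width are all hypothetical, and a trained model would learn its filter weights from data. The recurrent (bi-directional LSTM) stage that would consume the pooled output is not reproduced here.

```python
# Illustrative sketch only; this is NOT the DanQ source code.
# All names, the filter weights, and the example sequence are assumptions.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).

    Bases outside ACGT (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_relu(encoded, kernel, bias=0.0):
    """Slide one filter (width x 4 weights) along the sequence, then apply ReLU."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            for channel in range(4):
                total += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(max(0.0, total))
    return scores


def max_pool(scores, size):
    """Non-overlapping max pooling over windows of `size` positions."""
    return [max(scores[i:i + size]) for i in range(0, len(scores), size)]


# Hypothetical hand-set filter: +1 where the base matches "TATA", -1 otherwise.
motif = "TATA"
kernel = [[1.0 if b == m else -1.0 for b in BASES] for m in motif]

sequence = "GGCTATAAGC"
activations = conv1d_relu(one_hot(sequence), kernel)
pooled = max_pool(activations, 4)

print(activations)
print(pooled)
```

In this toy setting the filter responds only at the position where the motif occurs, and pooling keeps that response while shortening the sequence. In a hybrid model of the kind described above, many such pooled motif activations would be passed to the recurrent layer, which reads them in both directions to model dependencies between motifs.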