Prediction of enhancer-promoter interactions via natural language processing

BMC Genomics
Wanwen Zeng, ..., Rui Jiang

Abstract

Precise identification of three-dimensional genome organization, especially enhancer-promoter interactions (EPIs), is important for deciphering gene regulation, cell differentiation and disease mechanisms. Distinguishing true interactions from nearby non-interacting pairs remains challenging because traditional experimental methods are limited by low resolution or low throughput. We propose a novel computational framework, EP2vec, to assay three-dimensional genomic interactions. We first extract sequence embedding features, defined as fixed-length vector representations learned from variable-length sequences using an unsupervised deep learning method from natural language processing. Then, we train a classifier to predict EPIs using the learned representations in a supervised way. Experimental results demonstrate that EP2vec obtains F1 scores ranging from 0.841 to 0.933 on different datasets, outperforming existing methods. We demonstrate the robustness of the sequence embedding features through sensitivity analysis. In addition, we identify motifs that represent cell line-specific information by analyzing the learned sequence embedding features with an attention mechanism. Last, we show that ...
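
To make the two ingredients named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It shows one common way to turn a variable-length DNA sequence into "words" that an NLP embedding model could consume, and how an F1 score is computed from binary predictions. The function names `kmer_tokens` and `f1_score`, the k-mer length `k=6`, and the `stride` parameter are all assumptions made for illustration; the abstract does not state the tokenization or the embedding model's settings.

```python
from typing import List


def kmer_tokens(sequence: str, k: int = 6, stride: int = 1) -> List[str]:
    """Split a DNA sequence into overlapping k-mer 'words'.

    k and stride are hypothetical defaults chosen for illustration only.
    Sequences shorter than k yield an empty list.
    """
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def f1_score(y_true: List[int], y_pred: List[int]) -> float:
    """Compute F1 = 2 * precision * recall / (precision + recall).

    Labels are binary: 1 = interacting pair, 0 = non-interacting pair.
    Returns 0.0 when there are no true positives.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A toy sequence of length 8 produces three overlapping 6-mers.
    print(kmer_tokens("ACGTACGT", k=6))
    # One true positive, one false positive, one false negative.
    print(f1_score([1, 1, 0, 0], [1, 0, 1, 0]))
```

In a fuller pipeline, the token lists would be passed to an unsupervised document-embedding model to obtain fixed-length vectors, and a supervised classifier would then be trained on the enhancer and promoter vectors. Those steps are omitted here because the abstract does not specify them.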

Datasets Mentioned

BETA
GM12878

Methods Mentioned

BETA
interaction prediction
sequence-based prediction
interactions prediction
feature extraction
RNA-seq

Software Mentioned

EpiTensor
Epigenomics ChromHMM
SPEID
Hi
GBRT
word2vec
DeepBind
FANTOM5
IM
CentriMo
