Integrative modeling reveals key chromatin and sequence signatures predicting super-enhancers

Scientific Reports
Aziz Khan, Xuegong Zhang

Abstract

Super-enhancers (SEs) are clusters of transcriptional enhancers that control the expression of cell identity and disease-associated genes. Recent studies have demonstrated the role of multiple factors in SE formation; however, a systematic analysis assessing the relative predictive importance of the chromatin and sequence features of SEs and their constituents is lacking. In addition, a predictive model that integrates various types of data to predict SEs has not been established. Here, we integrated diverse genomic and epigenomic datasets to identify key signatures of SEs and investigated their predictive importance. Through integrative modeling, we identified Cdk8, Cdk9, and Smad3 as new features of SEs, which can define known and new SEs in mouse embryonic stem cells and pro-B cells. We compared six state-of-the-art machine learning models for predicting SEs and showed that non-parametric ensemble models outperformed parametric ones. We validated these models using cross-validation as well as independent datasets from four human cell types. Taken together, our systematic analysis and ranking of features can be used as a platform to define and understand the biology of SEs in other cell types.
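As a rough illustration of what ranking features by predictive importance can look like, the sketch below scores each feature on its own by the area under the ROC curve (AUC) and sorts the features by that score. This is a deliberately simple stand-in, not the procedure used in the paper, which the abstract does not specify. The feature names (`feature_strong`, `feature_weak`, `feature_noise`), the helper functions, and all values are hypothetical and synthetic.

```python
import random


def auc(scores, labels):
    """AUC from the Mann-Whitney rank statistic; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def rank_features(table, labels):
    """Return (feature name, AUC) pairs, most predictive feature first."""
    scored = [(name, auc(values, labels)) for name, values in table.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Synthetic toy data (an assumption for illustration only):
# 50 "super-enhancer" regions (label 1) and 200 background regions (label 0).
rng = random.Random(0)
labels = [1] * 50 + [0] * 200


def simulate(shift):
    """Hypothetical signal: positives are drawn from a distribution shifted by `shift`."""
    return [rng.gauss(shift if y == 1 else 0.0, 1.0) for y in labels]


toy_features = {
    "feature_strong": simulate(2.0),
    "feature_weak": simulate(0.5),
    "feature_noise": simulate(0.0),
}

ranking = rank_features(toy_features, labels)
for name, value in ranking:
    print(f"{name}: AUC = {value:.3f}")
```

Single-feature AUC ignores interactions between features, so a model-based importance measure or a cross-validated comparison of models would be the natural next step.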





Methods Mentioned

BETA
immunoprecipitation
acetylation
ChIP-seq
RNA-seq

Software Mentioned

R
learn
UCSC table browser
. plot
MACS (Model-based Analysis of ChIP-Seq)
Random
phastCons
bowtie
ngs
LibSVM
