Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments

Nucleic Acids Research
Pouya Kheradpour, Manolis Kellis

Abstract

Recent advances in technology have led to a dramatic increase in the number of available transcription factor ChIP-seq and ChIP-chip data sets. Understanding the motif content of these data sets is an important step in understanding the underlying mechanisms of regulation. Here we provide a systematic motif analysis for 427 human ChIP-seq data sets using motifs curated from the literature and also discovered de novo using five established motif discovery tools. We use a systematic pipeline for calculating motif enrichment in each data set, providing a principled way for choosing between motif variants found in the literature and for flagging potentially problematic data sets. Our analysis confirms the known specificity of 41 of the 56 analyzed factor groups and reveals motifs of potential cofactors. We also use cell type-specific binding to find factors active in specific conditions. The resource we provide is accessible both for browsing a small number of factors and for performing large-scale systematic analyses. We provide motif matrices, instances and enrichments in each of the ENCODE data sets. The motifs discovered here have been used in parallel studies to validate the specificity of antibodies, understand cooperativity ...
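
The abstract does not spell out how motif enrichment is computed, so the following is only a minimal sketch of one common approach, not the authors' pipeline. It assumes a motif is given as a per-position base-count matrix, converts it to log-odds scores against a uniform background, and compares the fraction of bound regions containing a match with the fraction of control regions containing one. The function names, the toy count matrix, the sequences, the score threshold and the pseudocounts are all hypothetical choices made for illustration. Only the forward strand is scanned, and sequences are assumed to contain only uppercase A, C, G and T.

```python
import math

BASES = "ACGT"


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores (assumed uniform background)."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix


def best_score(seq, matrix):
    """Highest motif score over all forward-strand windows of the sequence."""
    width = len(matrix)
    best = float("-inf")  # stays -inf if the sequence is shorter than the motif
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        best = max(best, score)
    return best


def count_hits(seqs, matrix, threshold):
    """Number of sequences with at least one window scoring at or above the threshold."""
    return sum(1 for s in seqs if best_score(s, matrix) >= threshold)


def enrichment(bound, control, matrix, threshold, pseudocount=1.0):
    """Ratio of hit fractions in bound vs. control regions, smoothed by a pseudocount."""
    frac_bound = (count_hits(bound, matrix, threshold) + pseudocount) / (len(bound) + pseudocount)
    frac_control = (count_hits(control, matrix, threshold) + pseudocount) / (len(control) + pseudocount)
    return frac_bound / frac_control


# Hypothetical toy motif with consensus TGACTCA: 10 observations of the
# consensus base at each position and none of the others.
consensus = "TGACTCA"
toy_counts = [{b: (10 if b == c else 0) for b in BASES} for c in consensus]
toy_matrix = log_odds_matrix(toy_counts)

# Made-up sequences standing in for bound regions and matched controls.
bound_regions = ["GGTGACTCAGG", "AATGACTCATT", "CCCCCCCCCCC"]
control_regions = ["GGGGGGGGGGG", "ACACACACACA", "TTTTTTTTTTT"]

# The threshold of 10.0 is arbitrary; with this toy matrix only exact
# consensus matches score above it.
toy_enrichment = enrichment(bound_regions, control_regions, toy_matrix, threshold=10.0)
print(toy_enrichment)
```

In practice the control set matters as much as the scoring: shuffled motifs or background regions matched for composition are typical choices, and a real analysis would also scan the reverse strand and handle ambiguous bases.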

Datasets Mentioned

GM12878

Methods Mentioned

immunoprecipitation
ChIP-chip
SELEX
footprinting

Software Mentioned

WebLogo
Transfac
ENCODE
