Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues

Nature Biotechnology
Jason Ernst, Manolis Kellis

Abstract

With hundreds of epigenomic maps, the opportunity arises to exploit the correlated nature of epigenetic signals, across both marks and samples, for large-scale prediction of additional datasets. Here, we undertake epigenome imputation by leveraging such correlations through an ensemble of regression trees. We impute 4,315 high-resolution signal maps, of which 26% are also experimentally observed. Imputed signal tracks show overall similarity to observed signals and surpass experimental datasets in consistency, recovery of gene annotations and enrichment for disease-associated variants. We use the imputed data to detect low-quality experimental datasets, to find genomic sites with unexpected epigenomic signals, to define high-priority marks for new experiments and to delineate chromatin states in 127 reference epigenomes spanning diverse tissues and cell types. Our imputed datasets provide the most comprehensive human regulatory region annotation to date, and our approach and the ChromImpute software constitute a useful complement to large-scale experimental mapping of epigenomic information.
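
The idea of predicting one signal track from correlated tracks with an ensemble of regression trees can be illustrated with a toy sketch. This is not the ChromImpute implementation: the three features (a different mark in the same sample, the same mark in a different sample, and an unrelated track), the synthetic signal values, the bagging scheme, and all function names are assumptions made purely for illustration.

```python
import random
from statistics import mean

def fit_tree(X, y, depth=0, max_depth=3, min_leaf=5):
    """Fit a small regression tree; a leaf is a float, a split is a dict."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean(y)
    best = None  # (summed squared error, feature index, threshold)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        step = max(1, len(values) // 10)  # try roughly ten thresholds per feature
        for t in values[step::step]:
            left = [yi for row, yi in zip(X, y) if row[f] < t]
            right = [yi for row, yi in zip(X, y) if row[f] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = mean(left), mean(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, f, t)
    if best is None:
        return mean(y)
    _, f, t = best
    lo = [i for i, row in enumerate(X) if row[f] < t]
    hi = [i for i, row in enumerate(X) if row[f] >= t]
    return {
        "f": f, "t": t,
        "lo": fit_tree([X[i] for i in lo], [y[i] for i in lo], depth + 1, max_depth, min_leaf),
        "hi": fit_tree([X[i] for i in hi], [y[i] for i in hi], depth + 1, max_depth, min_leaf),
    }

def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, dict):
        tree = tree["lo"] if row[tree["f"]] < tree["t"] else tree["hi"]
    return tree

def fit_ensemble(X, y, n_trees=10, seed=0):
    """Bagging: each tree is fit on a bootstrap resample of the training positions."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def predict_ensemble(trees, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(t, row) for t in trees)

def make_positions(n, rng):
    """Synthetic genomic positions; the linear relationship is an assumption."""
    X, y = [], []
    for _ in range(n):
        other_mark_same_sample = rng.random()
        same_mark_other_sample = rng.random()
        unrelated_track = rng.random()
        X.append([other_mark_same_sample, same_mark_other_sample, unrelated_track])
        y.append(0.6 * other_mark_same_sample + 0.4 * same_mark_other_sample
                 + rng.gauss(0, 0.02))
    return X, y

rng = random.Random(1)
X_train, y_train = make_positions(200, rng)  # positions where the target is observed
X_test, y_test = make_positions(100, rng)    # positions to impute
trees = fit_ensemble(X_train, y_train)

train_mean = mean(y_train)
mse_model = mean((predict_ensemble(trees, r) - t) ** 2 for r, t in zip(X_test, y_test))
mse_mean = mean((train_mean - t) ** 2 for t in y_test)
print(f"ensemble MSE: {mse_model:.4f}  mean-only baseline MSE: {mse_mean:.4f}")
```

Comparing the ensemble's error against a baseline that always predicts the mean training signal gives a rough sense of how much the correlated tracks contribute. A real application would use observed signal tracks rather than synthetic values.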
