Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk

Nature Genetics
Jian Zhou, Olga G Troyanskaya

Abstract

Key challenges for human genetics, precision medicine and evolutionary biology include deciphering the regulatory code of gene expression and understanding the transcriptional effects of genome variation. However, this is extremely difficult because of the enormous scale of the noncoding mutation space. We developed a deep learning-based framework, ExPecto, that can accurately predict, ab initio from a DNA sequence, the tissue-specific transcriptional effects of mutations, including those that are rare or that have not been observed. We prioritized causal variants within disease- or trait-associated loci from all publicly available genome-wide association studies and experimentally validated predictions for four immune-related diseases. By exploiting the scalability of ExPecto, we characterized the regulatory mutation space for human RNA polymerase II-transcribed genes by in silico saturation mutagenesis and profiled >140 million promoter-proximal mutations. This enables probing of evolutionary constraints on gene expression and ab initio prediction of mutation disease effects, making ExPecto an end-to-end computational framework for the in silico prediction of expression and disease risk.
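
As a rough illustration of the in silico saturation mutagenesis mentioned in the abstract, the sketch below enumerates every single-base substitution of a sequence and scores each one against the reference. It is not the ExPecto implementation: the function names (`saturation_mutants`, `mutation_effects`) are hypothetical, and `gc_fraction` is a toy stand-in for a trained expression model, used only so the example runs on its own.

```python
from typing import Callable, Iterator, List, Tuple

BASES = "ACGT"


def saturation_mutants(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, ref_base, alt_base, mutated_sequence) for every
    possible single-base substitution in seq."""
    seq = seq.upper()
    for pos, ref in enumerate(seq):
        if ref not in BASES:
            continue  # skip ambiguous bases such as N
        for alt in BASES:
            if alt != ref:
                yield pos, ref, alt, seq[:pos] + alt + seq[pos + 1:]


def gc_fraction(seq: str) -> float:
    """Toy scoring function (assumption): fraction of G/C bases.
    A real analysis would call a trained sequence-to-expression model here."""
    return sum(base in "GC" for base in seq) / len(seq)


def mutation_effects(
    seq: str, predict: Callable[[str], float] = gc_fraction
) -> List[Tuple[int, str, str, float]]:
    """Score each substitution as predicted(mutant) - predicted(reference)."""
    ref_score = predict(seq.upper())
    return [
        (pos, ref, alt, predict(mutant) - ref_score)
        for pos, ref, alt, mutant in saturation_mutants(seq)
    ]


# A 4-base sequence has 4 positions x 3 alternative bases = 12 mutants.
effects = mutation_effects("ACGT")
for pos, ref, alt, delta in effects[:3]:
    print(f"{ref}{pos + 1}{alt}: {delta:+.2f}")
```

Because each position yields three substitutions, the number of mutants grows linearly with sequence length, which is why a fast predictor is needed to cover promoter-proximal regions across many genes.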

Methods Mentioned

sequence-based prediction
RNA-seq
protein folding
transfection

Software Mentioned

Genewiz
DeepSEA
ExPecto
GTEx
PLINK
locfdr R package
