An information-theoretic approach to the modeling and analysis of whole-genome bisulfite sequencing data

BMC Bioinformatics
Garrett Jenkinson, John Goutsias

Abstract

DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole-genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high-resolution, genome-wide methylation profiles. Statistical modeling and analysis are employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads. We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied to single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigoro...
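To make the two ingredients named in the abstract concrete, here is a minimal sketch of a 1D Ising joint distribution over binary methylation states and its Shannon entropy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the per-site parameters `a`, the nearest-neighbor couplings `c`, and the brute-force enumeration (only feasible for a handful of sites) are all hypothetical choices made for this example.

```python
import itertools
import math


def ising_distribution(a, c):
    """Joint distribution of a 1D Ising chain over binary methylation states.

    a[i] is a per-site bias and c[i] couples site i to site i + 1.
    Both are illustrative parameters, not values from the paper.
    States are tuples of 0 (unmethylated) and 1 (methylated).
    """
    n = len(a)
    if len(c) != n - 1:
        raise ValueError("need one coupling per adjacent pair of sites")
    weights = {}
    for x in itertools.product((0, 1), repeat=n):
        s = [2 * xi - 1 for xi in x]  # map {0, 1} to spins {-1, +1}
        log_w = sum(a[i] * s[i] for i in range(n))
        log_w += sum(c[i] * s[i] * s[i + 1] for i in range(n - 1))
        weights[x] = math.exp(log_w)
    z = sum(weights.values())  # partition function by direct enumeration
    return {x: w / z for x, w in weights.items()}


def shannon_entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def methylation_level_distribution(dist):
    """Probability that exactly k sites are methylated, for k = 0..n."""
    n = len(next(iter(dist)))
    level = [0.0] * (n + 1)
    for x, p in dist.items():
        level[sum(x)] += p
    return level


if __name__ == "__main__":
    independent = ising_distribution([0.0, 0.0, 0.0], [0.0, 0.0])
    coupled = ising_distribution([0.0, 0.0, 0.0], [2.0, 2.0])
    print(shannon_entropy(independent.values()))
    print(shannon_entropy(coupled.values()))
    print(methylation_level_distribution(coupled))
```

With all parameters set to zero the sites are independent and every state is equally likely; a positive coupling makes neighboring sites agree more often, which lowers the entropy of the joint distribution. This is the kind of neighbor dependence that the abstract says most other methods do not account for directly.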


