A framework for analyzing DNA methylation data from Illumina Infinium HumanMethylation450 BeadChip

BMC Bioinformatics
Zhenxing Wang, Yadong Wang

Abstract

DNA methylation has been found to be widely associated with complex diseases. Among the platforms for profiling DNA methylation in humans, the Illumina Infinium HumanMethylation450 BeadChip (450K) is accepted as one of the most efficient technologies. However, analyzing the DNA methylation data generated by this technology remains challenging because of widespread biases. Here we propose a generalized framework for evaluating data analysis methods for the Illumina 450K array. The framework considers the following steps towards a successful analysis: importing data, quality control, within-array normalization, correcting type bias, detecting differentially methylated probes (DMPs) or regions, and biological interpretation. We evaluated five methods on three real datasets and recommend the best-performing methods for Illumina 450K array data analysis. Minfi and methylumi are the best choices for analyzing small datasets. BMIQ and RCP are well suited to correcting type bias, and their normalized output can be used to discover DMPs. The R package missMethyl is suitable for GO term enrichment analysis and biological interpretation.
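To make the quantities behind these steps concrete, here is a minimal Python sketch of how a methylation level is commonly summarized for a single probe and how a naive DMP screen could look. It is not the evaluation procedure of the paper and does not reproduce minfi, methylumi, BMIQ, RCP, or missMethyl. The beta value with an offset of 100 and the logit-based M-value are the standard definitions; the function names, the probe identifiers, and the 0.2 mean-difference threshold are assumptions chosen purely for illustration.

```python
import math
from statistics import mean


def beta_value(meth, unmeth, offset=100.0):
    """Beta value: methylated intensity over total intensity plus an offset.

    The offset (conventionally 100) stabilizes the ratio when both
    intensities are close to zero.
    """
    return meth / (meth + unmeth + offset)


def m_value(beta, eps=1e-6):
    """M-value: log2 of beta / (1 - beta), with beta clipped away from 0 and 1."""
    beta = min(max(beta, eps), 1.0 - eps)
    return math.log2(beta / (1.0 - beta))


def flag_dmps(betas_a, betas_b, min_delta=0.2):
    """Naive screen (illustrative only): keep probes whose mean beta differs
    between two groups by at least min_delta.

    betas_a and betas_b map a probe ID to a list of per-sample beta values.
    Returns a dict of probe ID -> mean difference (group B minus group A).
    """
    flagged = {}
    for probe in betas_a.keys() & betas_b.keys():
        delta = mean(betas_b[probe]) - mean(betas_a[probe])
        if abs(delta) >= min_delta:
            flagged[probe] = delta
    return flagged


# Hypothetical probe IDs and values, for illustration only.
group_a = {"probe_1": [0.1, 0.2], "probe_2": [0.5, 0.5]}
group_b = {"probe_1": [0.6, 0.7], "probe_2": [0.55, 0.5]}
candidates = flag_dmps(group_a, group_b)
```

In practice a DMP analysis would use a statistical test with multiple-testing correction on normalized data rather than a fixed mean-difference cutoff; the screen above only shows where normalized beta values enter the comparison.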
