Assessing the model transferability for prediction of transcription factor binding sites based on chromatin accessibility

BMC Bioinformatics
Sheng Liu, ..., Jiang Qian

Abstract

Computational prediction of transcription factor (TF) binding sites in different cell types is challenging. Recent technological developments make it possible to determine genome-wide chromatin accessibility in various cellular and developmental contexts. Chromatin accessibility profiles provide useful information for predicting TF binding events under various physiological conditions. Furthermore, ChIP-Seq analysis has been used to determine genome-wide binding sites for a range of TFs in multiple cell types. Integrating these two types of genomic information can improve the prediction of TF binding events. We assessed to what extent a model built on other TFs and/or other cell types could be used to predict the binding sites of TFs of interest. A random forest model was built using a set of cell type-independent features, such as the specific sequences recognized by the TFs and evolutionary conservation, as well as cell type-specific features derived from chromatin accessibility data. Our analysis suggested that the models learned from other TFs and/or cell lines performed almost as well as the model learned from the target TF in the cell type of interest. Interestingly, models based on multiple TFs performed better than ...
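
The transferability assessment described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the three feature columns (motif score, conservation, accessibility) and the helper names `simulate_sites` and `fit_and_score` are assumptions made for illustration, and scikit-learn's `RandomForestClassifier` stands in for whatever random forest implementation the study used. The sketch trains one model on the target TF and another on a different, hypothetical TF/cell type, then scores both on the same held-out target sites.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)


def simulate_sites(n, motif_weight, rng):
    """Synthetic candidate sites for one hypothetical TF/cell-type pair.

    Columns: motif score and conservation (cell type-independent features)
    and chromatin accessibility (cell type-specific feature).
    All values are made up for illustration.
    """
    X = rng.normal(size=(n, 3))
    signal = motif_weight * X[:, 0] + 0.5 * X[:, 1] + 1.5 * X[:, 2]
    noise = rng.normal(scale=1.0, size=n)
    y = (signal + noise > 1.0).astype(int)  # 1 = bound, 0 = unbound
    return X, y


def fit_and_score(X_train, y_train, X_test, y_test):
    """Train a random forest and return the ROC AUC on the test sites."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)
    prob_bound = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, prob_bound)


# Target TF in the cell type of interest: separate training and test sites.
X_target_train, y_target_train = simulate_sites(2000, 1.0, rng)
X_target_test, y_target_test = simulate_sites(2000, 1.0, rng)

# A different (hypothetical) TF / cell type with a slightly different
# dependence on the motif score.
X_other, y_other = simulate_sites(2000, 0.8, rng)

# Model learned from the target TF itself.
auc_within = fit_and_score(X_target_train, y_target_train,
                           X_target_test, y_target_test)
# Model learned from the other TF / cell type, applied to the target TF.
auc_transfer = fit_and_score(X_other, y_other,
                             X_target_test, y_target_test)

print(f"within-TF AUC:   {auc_within:.3f}")
print(f"transferred AUC: {auc_transfer:.3f}")
```

Comparing the two AUC values mirrors the question posed in the abstract: how much predictive performance is lost when the model is learned from other TFs and/or cell types rather than from the target itself. With real data, the feature columns would be derived from motif scans, conservation tracks, and accessibility profiles rather than simulated.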

Datasets Mentioned

GM12878

Methods Mentioned

ChIP-seq
