Extending information retrieval methods to personalized genomic-based studies of disease

Cancer Informatics
Shuyun Ye, Christina Kendziorski

Abstract

Genomic-based studies of disease now involve diverse types of data collected on large groups of patients. A major challenge facing statistical scientists is how best to combine the data, extract important features, and comprehensively characterize the ways in which they affect an individual's disease course and likelihood of response to treatment. We have developed a survival-supervised latent Dirichlet allocation (survLDA) modeling framework to address these challenges. Latent Dirichlet allocation (LDA) models have proven extremely effective at identifying themes common across large collections of text, but applications to genomics have been limited. Our framework extends LDA to the genome by considering each patient as a "document" with "text" detailing his/her clinical events and genomic state. We then further extend the framework to allow for supervision by a time-to-event response. The model enables the efficient identification of collections of clinical and genomic features that co-occur within patient subgroups, and then characterizes each patient by those features. An application of survLDA to The Cancer Genome Atlas ovarian project identifies informative patient subgroups showing differential response to treatment, and...
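
The "patient as document" idea in the abstract can be illustrated with a minimal sketch. It only shows how clinical and genomic features might be encoded as word counts and paired with a time-to-event response; it does not implement survLDA or its inference. The patient IDs, feature tokens, survival values, and the `build_corpus` helper are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy data: each patient "document" is a bag of tokens
# describing clinical events and genomic state. A repeated token
# stands for a repeated event.
patients = {
    "P1": ["stage_III", "TP53_mut", "BRCA1_low", "platinum_tx"],
    "P2": ["stage_IV", "TP53_mut", "CCNE1_amp", "platinum_tx", "platinum_tx"],
    "P3": ["stage_III", "BRCA1_low"],
}

# Hypothetical time-to-event response per patient:
# (follow-up time in months, True if the event was observed, False if censored).
response = {"P1": (34.0, True), "P2": (12.5, True), "P3": (60.0, False)}


def build_corpus(docs):
    """Turn token lists into a patient-by-feature count matrix."""
    vocab = sorted({tok for toks in docs.values() for tok in toks})
    index = {tok: j for j, tok in enumerate(vocab)}
    ids = sorted(docs)
    counts = []
    for pid in ids:
        row = [0] * len(vocab)
        for tok, n in Counter(docs[pid]).items():
            row[index[tok]] = n
        counts.append(row)
    return ids, vocab, counts


ids, vocab, counts = build_corpus(patients)

# Each row is the kind of input a topic model consumes; the response
# is what a supervised variant would additionally condition on.
for pid, row in zip(ids, counts):
    print(pid, row, response[pid])
```

A count matrix of this shape is the standard input to an LDA-style model, where "topics" would correspond to collections of co-occurring clinical and genomic features.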



Software Mentioned

survLDA
LDA
