Sep 17, 2016

Surrogate-assisted feature extraction for high-throughput phenotyping

Journal of the American Medical Informatics Association : JAMIA
Sheng Yu, Tianxi Cai

Abstract

Phenotyping algorithms are capable of accurately identifying patients with specific phenotypes from within electronic medical records systems. However, developing phenotyping algorithms in a scalable way remains a challenge due to the extensive human resources required. This paper introduces a high-throughput unsupervised feature selection method, which improves the robustness and scalability of electronic medical record phenotyping without compromising its accuracy. The proposed Surrogate-Assisted Feature Extraction (SAFE) method selects candidate features from a pool of comprehensive medical concepts found in publicly available knowledge sources. The target phenotype's International Classification of Diseases, Ninth Revision and natural language processing counts, acting as noisy surrogates to the gold-standard labels, are used to create silver-standard labels. Candidate features highly predictive of the silver-standard labels are selected as the final features. Algorithms were trained to identify patients with coronary artery disease, rheumatoid arthritis, Crohn's disease, and ulcerative colitis using various numbers of labels to compare the performance of features selected by SAFE, a previously published automated feature e...
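
The selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the thresholds `low` and `high`, the feature names, the toy counts, and the use of absolute Pearson correlation as the measure of "highly predictive" are all assumptions made for illustration, and the abstract does not say which model SAFE actually fits to the silver-standard labels.

```python
from statistics import mean, pstdev


def silver_labels(surrogate_counts, low=0, high=3):
    """Turn noisy surrogate counts into silver-standard labels.

    Hypothetical rule: counts <= low become 0, counts >= high become 1,
    and anything in between is left unlabeled (None).
    """
    labels = []
    for count in surrogate_counts:
        if count <= low:
            labels.append(0)
        elif count >= high:
            labels.append(1)
        else:
            labels.append(None)
    return labels


def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)


def select_features(features, labels, top_k=2):
    """Rank candidate features by |correlation| with the silver labels.

    Stand-in for a real predictive model; unlabeled patients are skipped.
    """
    keep = [i for i, y in enumerate(labels) if y is not None]
    ys = [labels[i] for i in keep]
    scores = {
        name: abs(correlation([values[i] for i in keep], ys))
        for name, values in features.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Toy data (invented): one surrogate count and three candidate features
# for eight patients.
icd_counts = [0, 5, 0, 4, 1, 0, 6, 0]
candidate_features = {
    "joint_pain_mentions": [0, 3, 0, 2, 1, 0, 4, 1],
    "methotrexate_mentions": [0, 2, 0, 3, 0, 0, 2, 0],
    "flu_vaccine_mentions": [1, 1, 0, 0, 1, 1, 0, 0],
}

labels = silver_labels(icd_counts)
selected, scores = select_features(candidate_features, labels, top_k=2)
print(selected)
```

No gold-standard labels are used at the selection step in this sketch. They would only be needed afterwards, to train the final phenotyping algorithm on the reduced feature set.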

Mentioned in this Paper

Electronic Health Records
Size
Classification
Coronary Artery Disease
Coronary Arteriosclerosis
Rheumatoid Arthritis
Extraction
High Throughput Analysis
Natural Language Processing
Revision Procedure