Clustering and variable selection in the presence of mixed variable types and missing data

Statistics in Medicine
C B Storlie, J D Port

Abstract

We consider the problem of model-based clustering in the presence of many correlated, mixed continuous and discrete variables, some of which may have missing values. Discrete variables are treated with a latent continuous variable approach, and the Dirichlet process is used to construct a mixture model with an unknown number of components. Variable selection is also performed to identify the variables that are most influential in determining cluster membership. The work is motivated by the need to cluster patients suspected of having autism spectrum disorder on the basis of many cognitive and/or behavioral test scores. The data set contains a modest number of patients (486) and many (55) test score variables, many of which are discrete valued and/or missing. The goal of the work is to (1) cluster these patients into similar groups to help identify those with similar clinical presentation and (2) identify a sparse subset of tests that inform the clusters in order to eliminate unnecessary testing. In simulations of problems of this type, the proposed approach compares very favorably with other methods. The results of the autism spectrum disorder analysis suggested 3 clusters to be most likely, while only 4 te...
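The two building blocks named in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it only shows a truncated stick-breaking construction of Dirichlet process mixture weights and the thresholding of a latent continuous value into an ordinal category. The function names, the truncation level, the unit variances, and the cutpoints are all assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, num_components, seed=0):
    """Truncated stick-breaking weights for a Dirichlet process (illustrative)."""
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(num_components - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component takes the remainder, so weights sum to 1
    return weights


def discretize(latent_value, cutpoints):
    """Map a latent continuous value to an ordinal category by thresholding."""
    # The category is the number of cutpoints the latent value exceeds.
    return sum(latent_value > c for c in cutpoints)


def sample_mixed_record(weights, means, cutpoints, rng):
    """Draw a cluster label, one continuous value, and one ordinal value.

    Assumes unit-variance Gaussian components (a simplification for this sketch).
    """
    k = rng.choices(range(len(weights)), weights=weights)[0]
    continuous = rng.gauss(means[k], 1.0)
    latent = rng.gauss(means[k], 1.0)  # unobserved; only its category is recorded
    return k, continuous, discretize(latent, cutpoints)


if __name__ == "__main__":
    rng = random.Random(1)
    w = stick_breaking_weights(alpha=1.0, num_components=10)
    means = [rng.gauss(0.0, 3.0) for _ in w]  # hypothetical cluster means
    for _ in range(5):
        print(sample_mixed_record(w, means, [-1.0, 0.0, 1.0], rng))
```

A small concentration parameter `alpha` puts most of the weight on a few components, which is how a Dirichlet process mixture can leave the effective number of clusters to be determined by the data.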

