Machine learning for characterizing risk of type 2 diabetes mellitus in a rural Chinese population: the Henan Rural Cohort Study

Scientific Reports
Liying Zhang ... Zhenfei Wang

Abstract

With the development of data mining, machine learning offers opportunities to improve discrimination by analyzing complex interactions among large numbers of variables. To test the ability of machine learning algorithms to predict the risk of type 2 diabetes mellitus (T2DM) in a rural Chinese population, we analyzed a total of 36,652 eligible participants from the Henan Rural Cohort Study. Risk assessment models for T2DM were developed using six machine learning algorithms: logistic regression (LR), classification and regression tree (CART), artificial neural networks (ANN), support vector machine (SVM), random forest (RF) and gradient boosting machine (GBM). Model performance was measured by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value, negative predictive value and the area under the precision-recall curve. Variable importance was identified based on each classifier and the Shapley additive explanations (SHAP) approach. Using all available variables, all models for predicting risk of T2DM demonstrated strong predictive performance, with AUCs ranging from 0.811 to 0.872 using laboratory data and from 0.767 to 0.817 without laboratory data. Among them, the GBM ...
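The comparison described in the abstract can be sketched with scikit-learn, which the page lists among the software mentioned. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic (generated with `make_classification`), the hyperparameters, the 70/30 split and the 0.5 decision threshold are arbitrary choices, and the `evaluate` helper is hypothetical. SHAP-based importance is omitted because it needs a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: NOT the cohort data).
X, y = make_classification(
    n_samples=2000, n_features=15, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# The six algorithm families named in the abstract; settings are illustrative.
models = {
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "CART": DecisionTreeClassifier(max_depth=5, random_state=0),
    "ANN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    ),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "GBM": GradientBoostingClassifier(random_state=0),
}


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(model, X_eval, y_eval, threshold=0.5):
    """Hypothetical helper: compute the metrics listed in the abstract."""
    prob = model.predict_proba(X_eval)[:, 1]
    pred = (prob >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_eval, pred, labels=[0, 1]).ravel()
    return {
        "AUC": roc_auc_score(y_eval, prob),
        # Average precision is a common estimate of the area under the PR curve.
        "AUPR": average_precision_score(y_eval, prob),
        "sensitivity": safe_ratio(tp, tp + fn),
        "specificity": safe_ratio(tn, tn + fp),
        "PPV": safe_ratio(tp, tp + fp),
        "NPV": safe_ratio(tn, tn + fn),
    }


results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = evaluate(model, X_test, y_test)
    print(name, {k: round(v, 3) for k, v in results[name].items()})
```

Sensitivity, specificity, PPV and NPV all depend on the chosen threshold, whereas AUC and the precision-recall area summarize performance across thresholds, so a fixed cutoff of 0.5 on imbalanced data will often give low sensitivity even when the AUC is high.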

Software Mentioned

Python
SPSS
sklearn
