Early detection of type 2 diabetes mellitus using machine learning-based prediction models.

Scientific Reports
Leon Kopitar, Gregor Stiglic

Abstract

Most screening tests for type 2 diabetes mellitus (T2DM) in use today were developed using multivariate regression methods that are often further simplified to allow transformation into a scoring formula. The increasing volume of electronically collected data has created the opportunity to develop more complex, accurate prediction models that can be continuously updated using machine learning approaches. This study compares machine learning-based prediction models (Glmnet, RF, XGBoost, LightGBM) to commonly used regression models for prediction of undiagnosed T2DM. Performance in predicting fasting plasma glucose level was measured using 100 bootstrap iterations on different subsets of data, simulating new incoming data in 6-month batches. With 6 months of data available, the simple regression model achieved the lowest average RMSE of 0.838, followed by RF (0.842), LightGBM (0.846), Glmnet (0.859) and XGBoost (0.881). When more data were added, Glmnet showed the largest improvement (+3.4%). The highest level of variable selection stability over time was observed with LightGBM models. Our results show no clinically relevant improvement when more sophisticated prediction models were used. Since higher stability of selected variables over time cont...
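The evaluation described in the abstract (average RMSE over 100 bootstrap iterations) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the model is a one-predictor least-squares line standing in for the "simple regression model", and scoring each resample on its out-of-bag rows is one common bootstrap scheme, not necessarily the one the authors used. The helper names `fit_line`, `rmse`, and `bootstrap_rmse` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to predicting the mean
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def bootstrap_rmse(xs, ys, n_iter=100, seed=0):
    """Mean out-of-bag RMSE over n_iter bootstrap resamples (assumed scheme)."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_iter):
        # Draw n row indices with replacement to form the training set.
        train_idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(train_idx)
        # Rows never drawn form the held-out (out-of-bag) test set.
        test_idx = [i for i in range(n) if i not in in_bag]
        if not test_idx:
            continue
        intercept, slope = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        preds = [intercept + slope * xs[i] for i in test_idx]
        scores.append(rmse([ys[i] for i in test_idx], preds))
    return sum(scores) / len(scores)


# Synthetic example (NOT study data): one predictor and a noisy glucose-like target.
data_rng = random.Random(42)
xs = [data_rng.uniform(18, 80) for _ in range(200)]
ys = [4.5 + 0.02 * x + data_rng.gauss(0, 0.8) for x in xs]

mean_rmse = bootstrap_rmse(xs, ys, n_iter=100)
print(f"mean out-of-bag RMSE over 100 resamples: {mean_rmse:.3f}")
```

Comparing several models this way would mean running the same resampling loop for each model and comparing the resulting average RMSE values, which is how figures such as 0.838 versus 0.842 can be read.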

Methods Mentioned

LightGBM

Software Mentioned

eXtreme Gradient Boosting (XGBoost)
Random Forest (RF)
Glmnet
lm
LightGBM
AUPRC
