Recursive Random Forests Enable Better Predictive Performance and Model Interpretation than Variable Selection by LASSO

Journal of Chemical Information and Modeling
Xiang-Wei Zhu, Hui-Lin Ge

Abstract

Variable selection is crucial in QSAR modeling because it increases a model's predictive ability and reduces noise. Selecting the right variables is far more complicated than developing predictive models. In this study, eight continuous and categorical data sets were employed to explore the applicability of two distinct variable selection methods: random forests (RF) and the least absolute shrinkage and selection operator (LASSO). Variable selection was performed (1) by using recursive random forests to rule out a quarter of the least important descriptors at each iteration and (2) by using LASSO modeling with 10-fold inner cross-validation to tune its penalty λ for each data set. Along with the regular statistical parameters of model performance, we proposed the highest pairwise correlation rate, the average pairwise Pearson's correlation coefficient, and the Tanimoto coefficient to evaluate the optimal variable subsets selected by RF and LASSO in an extensive way. Results showed that variable selection could allow a tremendous reduction of noisy descriptors (at most 96% with the RF method in this study) and apparently enhance the models' predictive performance as well. Furthermore, random forests showed the property of gathering important predictors ...
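The elimination schedule and two of the subset-comparison measures named in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `importance_fn` is a hypothetical callback standing in for refitting a random forest and reading off descriptor importances, the average pairwise correlation is assumed to use absolute Pearson values, and the Tanimoto coefficient is assumed to compare two sets of selected descriptor names.

```python
import math
from itertools import combinations
from statistics import mean


def recursive_elimination(names, importance_fn, drop_fraction=0.25, min_keep=1):
    """Yield successively smaller descriptor subsets.

    importance_fn(subset) -> {name: score} is a hypothetical callback,
    e.g. a wrapper that refits a random forest on `subset` and returns
    its importances. At each iteration the least important quarter
    (at least one descriptor) is dropped.
    """
    current = list(names)
    yield list(current)
    while len(current) > min_keep:
        scores = importance_fn(current)
        ranked = sorted(current, key=lambda n: scores[n], reverse=True)
        n_drop = max(1, int(len(ranked) * drop_fraction))
        n_keep = max(min_keep, len(ranked) - n_drop)
        current = ranked[:n_keep]
        yield list(current)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_pairwise_abs_correlation(columns):
    """Mean |r| over all pairs of descriptor columns.

    `columns` maps descriptor name -> list of values. Taking the
    absolute value is an assumption made for this sketch.
    """
    pairs = combinations(columns.values(), 2)
    return mean(abs(pearson(x, y)) for x, y in pairs)


def tanimoto(selected_a, selected_b):
    """Tanimoto (Jaccard) overlap between two sets of descriptor names."""
    a, b = set(selected_a), set(selected_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

In practice the subset kept from `recursive_elimination` would be the one with the best cross-validated performance, and `tanimoto` could then quantify how much the RF-selected and LASSO-selected descriptor sets overlap.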
