CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods

Scientific Reports
Li Zhang, Hongsheng Liu

Abstract

Carcinogenicity is a highly toxic endpoint of certain chemicals and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict the carcinogenicity of chemicals. The models combine seven types of molecular fingerprints with three machine learning methods and are based on a dataset of 1003 diverse compounds with rat carcinogenicity data. Among the three models, Ensemble XGBoost performed best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation, and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. Compared with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity, but lower sensitivity, than rule-based expert systems. The ensemble models could also be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in ...
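
The three metrics quoted in the abstract (accuracy, sensitivity, specificity) and the general idea of combining several fingerprint-specific base models can be illustrated with a minimal Python sketch. It assumes the base models' predicted probabilities are combined by simple averaging and thresholded at 0.5; the abstract does not state the combination rule actually used, so this is only one plausible scheme. All names (`ensemble_average`, `classify`, `evaluate`) and all numbers are hypothetical toy values, not data or results from the study.

```python
from statistics import mean


def ensemble_average(prob_lists):
    """Average per-compound probabilities across base models.

    prob_lists holds one list per base model (e.g. one model per
    fingerprint type); every list covers the same compounds in the same order.
    Simple averaging is an assumption, not the paper's confirmed method.
    """
    return [mean(probs) for probs in zip(*prob_lists)]


def classify(probabilities, threshold=0.5):
    """Label a compound as carcinogenic (1) if its probability >= threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity: fraction of true carcinogens that are detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of non-carcinogens correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy data: 6 compounds, 3 base models (illustrative values only).
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.6, 0.3, 0.2, 0.6, 0.1]
model_b = [0.8, 0.4, 0.4, 0.3, 0.5, 0.2]
model_c = [0.7, 0.8, 0.2, 0.1, 0.7, 0.3]

averaged = ensemble_average([model_a, model_b, model_c])
predictions = classify(averaged)
metrics = evaluate(y_true, predictions)
print(predictions)
print(metrics)
```

In a five-fold cross-validation setting, `evaluate` would be applied to each held-out fold and the resulting values summarized as mean ± standard deviation, which is the form in which the abstract reports its figures.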

Methods Mentioned

ISS

Software Mentioned

lazar
Ensemble SVM
kernlab
CarcinoPred
randomForest
R package caret
Ensemble XGBoost
VEGA
Descriptor
admetSAR
