Many-Body Descriptors for Predicting Molecular Properties with Machine Learning: Analysis of Pairwise and Three-Body Interactions in Molecules

Journal of Chemical Theory and Computation
Wiktor Pronobis, Klaus-Robert Müller

Abstract

Machine learning (ML) based prediction of molecular properties across chemical compound space is an important alternative approach for efficiently estimating the solutions of highly complex many-electron problems in chemistry and physics. Statistical methods represent molecules as descriptors that should encode molecular symmetries and interactions between atoms. Many such descriptors have been proposed; each has advantages and limitations. Here, we propose a set of general two-body and three-body interaction descriptors that are invariant to translation, rotation, and atomic indexing. By adapting the well-established kernel ridge regression methods of machine learning, we evaluate our descriptors on predicting several properties of small organic molecules calculated using density-functional theory. We use two data sets. The GDB-7 set contains 6868 molecules with up to 7 heavy atoms of type CNO. The GDB-9 set comprises 131722 molecules with up to 9 heavy atoms of type CNO. When trained on 5000 random molecules, our best model achieves an accuracy of 0.8 kcal/mol on the remaining 1868 molecules of GDB-7 and 1.5 kcal/mol on the remaining 126722 molecules of GDB-9. Applying a linear regression m...
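To make the invariance requirement concrete, the following is a minimal sketch of a two-body descriptor combined with kernel ridge regression. It is an illustration under stated assumptions, not necessarily the descriptor or kernel defined in the paper: the function names `two_body_descriptor` and `krr_fit_predict`, the choice of inverse-distance powers, the Gaussian kernel, and the default hyperparameters `sigma` and `lam` are all hypothetical. Because the features depend only on interatomic distances and are summed over element pairs, they do not change under translation, rotation, or reordering of the atoms.

```python
import itertools
import numpy as np


def two_body_descriptor(charges, coords, elements=(1, 6, 7, 8), powers=(1, 2, 3)):
    """Sum r_ij**(-p) over atom pairs, grouped by unordered element pair.

    Hypothetical illustration: `elements` (nuclear charges of H, C, N, O)
    and `powers` are assumed choices, not values taken from the paper.
    """
    charges = np.asarray(charges)
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances depend only on relative geometry, which gives
    # invariance to translation and rotation.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    rows, cols = np.triu_indices(len(charges), k=1)  # each atom pair once
    zi, zj, r = charges[rows], charges[cols], dist[rows, cols]
    features = []
    for a, b in itertools.combinations_with_replacement(elements, 2):
        # Summing over all pairs of a given element type removes any
        # dependence on the order in which atoms are listed.
        mask = ((zi == a) & (zj == b)) | ((zi == b) & (zj == a))
        for p in powers:
            features.append(np.sum(r[mask] ** (-p)))
    return np.array(features)


def krr_fit_predict(X_train, y_train, X_test, sigma=1.0, lam=1e-8):
    """Kernel ridge regression with a Gaussian kernel (assumed kernel choice)."""
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)

    def gaussian_kernel(A, B):
        sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq / (2.0 * sigma ** 2))

    K = gaussian_kernel(X_train, X_train)
    # Closed-form solution: alpha = (K + lam * I)^-1 y
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), np.asarray(y_train, dtype=float))
    return gaussian_kernel(X_test, X_train) @ alpha
```

In use, each molecule would be mapped to a feature vector with `two_body_descriptor`, the vectors stacked into a matrix, and `krr_fit_predict` called with the training properties; `sigma` and `lam` would normally be selected by cross-validation.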
