Effective Molecular Descriptors for Chemical Accuracy at DFT Cost: Fragmentation, Error-Cancellation, and Machine Learning

Journal of Chemical Theory and Computation
Eric M Collins, Krishnan Raghavachari

Abstract

Recent advances in theoretical thermochemistry have allowed the study of small organic and bio-organic molecules with high accuracy. However, applications to larger molecules are still impeded by the steep scaling problem of highly accurate quantum mechanical (QM) methods, forcing the use of approximate, more cost-effective methods at a greatly reduced accuracy. One of the most successful strategies to mitigate this error is the use of systematic error-cancellation schemes, in which highly accurate QM calculations can be performed on small portions of the molecule to construct corrections to an approximate method. Herein, we build on ideas from fragmentation and error-cancellation to introduce a new family of molecular descriptors for machine learning modeled after the Connectivity-Based Hierarchy (CBH) of generalized isodesmic reaction schemes. The best performing descriptor ML(CBH-2) is constructed from fragments preserving only the immediate connectivity of all heavy (non-H) atoms of a molecule along with overlapping regions of fragments in accordance with the inclusion-exclusion principle. Our proposed approach offers a simple, chemically intuitive grouping of atoms, tuned with an optimal amount of error-cancellation, and o...
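The CBH-2 construction described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `cbh2_signed_counts`, the input format (a list of heavy-atom element symbols plus a list of bonded index pairs), and the crude sorted-element labels are all assumptions made for illustration. The sketch ignores hydrogens, bond orders, and stereochemistry, and it treats the overlap between fragments centered on two bonded atoms as the bond itself. Each heavy atom contributes a fragment containing itself and its immediate heavy-atom neighbors with coefficient +1, and each bond contributes its two-atom overlap with coefficient -1, in the spirit of inclusion-exclusion.

```python
from collections import Counter


def cbh2_signed_counts(elements, bonds):
    """Toy CBH-2-style fragment counts (illustrative assumption, not the paper's code).

    elements: list of heavy-atom element symbols, e.g. ["C", "C", "C", "C"]
    bonds:    list of (i, j) index pairs between bonded heavy atoms
    Returns a dict mapping a crude fragment label to its signed coefficient.
    """
    # Build the heavy-atom neighbor list.
    neighbors = {i: set() for i in range(len(elements))}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    signed = Counter()
    # +1 for each atom-centered fragment (atom plus immediate heavy neighbors).
    for atom, nbrs in neighbors.items():
        signed[frozenset({atom} | nbrs)] += 1
    # -1 for each bond, taken here as the overlap of two adjacent fragments.
    for a, b in bonds:
        signed[frozenset((a, b))] -= 1

    # Collapse atom-index sets to crude labels: the sorted element symbols.
    by_label = Counter()
    for frag, coeff in signed.items():
        label = "".join(sorted(elements[i] for i in frag))
        by_label[label] += coeff
    return {label: c for label, c in by_label.items() if c != 0}


# n-butane heavy-atom skeleton: C0-C1-C2-C3
butane = cbh2_signed_counts(["C", "C", "C", "C"], [(0, 1), (1, 2), (2, 3)])
print(butane)
```

For the n-butane skeleton, the terminal atom-centered fragments coincide with the terminal bonds and cancel, leaving two three-carbon fragments and one two-carbon overlap with a negative sign. Read as a reaction, that is butane plus ethane going to two propanes. A signed count vector of this kind is one plausible way to picture a fragment-based descriptor; how the paper actually encodes ML(CBH-2) is not specified in the text above.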



