Chemical diversity in molecular orbital energy predictions with kernel ridge regression

The Journal of Chemical Physics
Annika Stuke, ..., Patrick Rinke

Abstract

Instant machine learning predictions of molecular properties are desirable for materials design, but the predictive power of the methodology is mainly tested on well-known benchmark datasets. Here, we investigate the performance of machine learning with kernel ridge regression (KRR) for the prediction of molecular orbital energies on three large datasets: the standard QM9 small organic molecules set, amino acid and dipeptide conformers, and organic crystal-forming molecules extracted from the Cambridge Structural Database. We focus on the prediction of highest occupied molecular orbital (HOMO) energies, computed at the density-functional level of theory. Two different representations that encode the molecular structure are compared: the Coulomb matrix (CM) and the many-body tensor representation (MBTR). We find that KRR performance depends significantly on the chemistry of the underlying dataset and that the MBTR is superior to the CM, predicting HOMO energies with a mean absolute error as low as 0.09 eV. To demonstrate the power of our machine learning method, we apply our model to structures of 10k previously unseen molecules. We gain instant energy predictions that allow us to identify interesting molecules for future applic...
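
The workflow named in the abstract (encode each molecule as a Coulomb matrix, fit KRR, report the mean absolute error) can be illustrated with a minimal sketch. This is not the authors' code. It uses the standard Coulomb matrix definition (diagonal 0.5·Z^2.4, off-diagonal Z_i·Z_j/|R_i − R_j|, atomic units) and makes several assumptions the abstract does not confirm: rows are sorted by norm, the kernel is Gaussian, and the values of `sigma` and `lam` are arbitrary. The function names are hypothetical, and the data are stretched H2 geometries with a made-up target, not HOMO energies.

```python
import numpy as np

def coulomb_matrix(Z, R, size):
    """Flattened, zero-padded Coulomb matrix (rows sorted by norm: an assumption)."""
    Z = np.asarray(Z, dtype=float)
    R = np.asarray(R, dtype=float)
    n = len(Z)
    D = np.linalg.norm(R[:, None, :] - R[None, :, :], axis=-1)
    np.fill_diagonal(D, 1.0)                 # placeholder to avoid division by zero
    M = np.outer(Z, Z) / D                   # off-diagonal: Z_i * Z_j / |R_i - R_j|
    np.fill_diagonal(M, 0.5 * Z ** 2.4)      # diagonal: 0.5 * Z_i^2.4
    order = np.argsort(-np.linalg.norm(M, axis=1))
    M = M[order][:, order]
    padded = np.zeros((size, size))
    padded[:n, :n] = M
    return padded.ravel()

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel between the rows of A and the rows of B (kernel choice assumed)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def krr_fit(X, y, sigma, lam):
    """Solve (K + lam * I) alpha = y for the regression weights alpha."""
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, sigma):
    """Predict as a kernel-weighted sum over the training points."""
    return gaussian_kernel(X_new, X_train, sigma) @ alpha

def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Toy data only: H2 at several bond lengths (bohr) with a synthetic target.
bond_lengths = np.linspace(1.0, 3.0, 9)
X = np.array([coulomb_matrix([1, 1], [[0, 0, 0], [0, 0, d]], size=2)
              for d in bond_lengths])
y = -1.0 / bond_lengths                      # made-up property, NOT a HOMO energy

train, test = slice(0, None, 2), slice(1, None, 2)
alpha = krr_fit(X[train], y[train], sigma=0.2, lam=1e-6)   # arbitrary hyperparameters
pred = krr_predict(X[train], alpha, X[test], sigma=0.2)
print("toy MAE:", mean_absolute_error(y[test], pred))
```

In practice `sigma` and `lam` would be chosen by cross-validation, and an established library implementation of KRR and of the descriptors would normally replace these hand-written functions.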
