Computational Workflow for Accelerated Molecular Design Using Quantum Chemical Simulations and Deep Learning Models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Efficient methods for searching the chemical space of molecular compounds are needed to automate and accelerate the design of new functional molecules such as pharmaceuticals. Given the high cost in both resources and time for experimental efforts, computational approaches play a key role in guiding the selection of promising molecules for further investigation. Here, we construct a workflow to accelerate design by combining approximate quantum chemical methods [i.e. density-functional tight-binding (DFTB)], a graph convolutional neural network (GCNN) surrogate model for chemical property prediction, and a masked language model (MLM) for molecule generation. Property data from the DFTB calculations are used to train the surrogate model; the surrogate model is used to score candidates generated by the MLM. The surrogate reduces computation time by orders of magnitude compared to the DFTB calculations, enabling an increased search of chemical space. Furthermore, the MLM generates a diverse set of chemical modifications based on pre-training from a large compound library. We utilize the workflow to search for near-infrared photoactive molecules by minimizing the predicted HOMO-LUMO gap as the target property. Our results show that the workflow can generate optimized molecules outside of the original training set, which suggests that iterations of the workflow could be useful for searching vast chemical spaces in a wide range of design problems.
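The generate-and-score loop described in the abstract can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in, not the authors' implementation: `propose_mutations` mimics masked-token replacement with random sampling instead of a pre-trained MLM, `surrogate_gap` is a toy scoring function instead of a GCNN trained on DFTB data, and the keep-the-best selection scheme in `search` is an assumption about how candidates might be carried between iterations. The token strings are not checked for chemical validity.

```python
import random

# Hypothetical token vocabulary; a real MLM would learn token
# distributions from pre-training on a large compound library.
VOCAB = ["C", "N", "O", "S", "c", "n"]

# Toy per-token weights (assumption). These have no physical meaning.
TOY_WEIGHTS = {"C": 1.0, "N": 0.8, "O": 0.9, "S": 0.5, "c": 0.6, "n": 0.4}


def propose_mutations(tokens, rng, n_candidates=8):
    """Stand-in for MLM generation: mask one position, refill it at random."""
    candidates = []
    for _ in range(n_candidates):
        pos = rng.randrange(len(tokens))
        mutated = list(tokens)
        mutated[pos] = rng.choice(VOCAB)
        candidates.append(tuple(mutated))
    return candidates


def surrogate_gap(tokens):
    """Stand-in for the surrogate model: a toy score, NOT a HOMO-LUMO gap."""
    return sum(TOY_WEIGHTS[t] for t in tokens) / len(tokens)


def search(seed, generations=5, population=4, rng_seed=0):
    """Iterate generate -> score -> select, keeping the lowest-scoring candidates."""
    rng = random.Random(rng_seed)
    pool = [tuple(seed)]
    for _ in range(generations):
        candidates = set(pool)  # parents stay eligible, so the best score never worsens
        for parent in pool:
            candidates.update(propose_mutations(parent, rng))
        pool = sorted(candidates, key=surrogate_gap)[:population]
    return pool


if __name__ == "__main__":
    for tokens in search(list("CCCCCC")):
        print("".join(tokens), round(surrogate_gap(tokens), 3))
```

Because scoring a candidate with the surrogate is cheap, a loop of this shape can evaluate many more candidates per iteration than direct quantum chemical calculations would allow, which is the speedup the abstract attributes to the surrogate.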

Original language: English
Title of host publication: Accelerating Science and Engineering Discoveries Through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation - 22nd Smoky Mountains Computational Sciences and Engineering Conference, SMC 2022, Revised Selected Papers
Editors: Doug Kothe, Al Geist, Swaroop Pophale, Hong Liu, Suzanne Parete-Koon
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 3-19
Number of pages: 17
ISBN (Print): 9783031236051
State: Published - 2022
Event: Smoky Mountains Computational Sciences and Engineering Conference, SMC 2022 - Virtual, Online
Duration: Aug 24 2022 to Aug 25 2022

Publication series

Name: Communications in Computer and Information Science
Volume: 1690 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: Smoky Mountains Computational Sciences and Engineering Conference, SMC 2022
City: Virtual, Online
Period: 08/24/22 to 08/25/22

Funding

Acknowledgements. We thank Pilsun Yoo for fruitful discussions on the synthesizability score. This work was supported in part by the Office of Science of the Department of Energy and by the Laboratory Directed Research and Development (LDRD) Program of Oak Ridge National Laboratory. This research is sponsored by the Artificial Intelligence Initiative as part of the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. The research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This work used resources of the Oak Ridge Leadership Computing Facility, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
