Two excited-state datasets for quantum chemical UV-vis spectra of organic molecules

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

We present two open-source datasets that provide time-dependent density-functional tight-binding (TD-DFTB) electronic excitation spectra of organic molecules. These datasets contain predictions of UV-vis absorption spectra computed on the optimized geometries of the molecules in their electronic ground state. The GDB-9-Ex dataset contains a subset of 96,766 organic molecules from the original open-source GDB-9 dataset. The ORNL_AISD-Ex dataset consists of 10,502,904 organic molecules that contain between 5 and 71 non-hydrogen atoms. The data reveal a close quantitative correlation between the gap separating the highest occupied molecular orbital (HOMO) from the lowest unoccupied molecular orbital (LUMO) and the excitation energy of the lowest singlet excited state. The chemical variability of the large number of molecules was examined with a topological fingerprint estimation based on extended-connectivity fingerprints (ECFPs), followed by uniform manifold approximation and projection (UMAP) for dimension reduction. Both datasets were generated using the DFTB+ software on the “Andes” cluster of the Oak Ridge Leadership Computing Facility (OLCF).
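As an illustration of how a reader might check the HOMO-LUMO gap versus excitation-energy correlation on their own extract of the data, the following is a minimal sketch that computes a Pearson correlation coefficient using only the Python standard library. The function name `pearson_r`, the variable names, and the numeric values are assumptions made for this example; the values are hypothetical placeholders and are not taken from GDB-9-Ex or ORNL_AISD-Ex, and the abstract does not state which correlation measure the authors used.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values in eV, NOT taken from either dataset.
homo_lumo_gaps_ev = [3.1, 4.0, 4.8, 5.5, 6.2]
lowest_singlet_ev = [3.0, 3.8, 4.9, 5.3, 6.1]

r = pearson_r(homo_lumo_gaps_ev, lowest_singlet_ev)
print(f"Pearson r = {r:.3f}")
```

In practice the two lists would be replaced by per-molecule values parsed from the dataset files; a value of r close to 1 would indicate the kind of close linear relationship the abstract describes.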

Original language: English
Article number: 546
Journal: Scientific Data
Volume: 10
Issue number: 1
State: Published - Dec 2023

Funding

The authors thank Dr. Vladimir Protopopescu for his valuable feedback in the preparation of this manuscript. This work was supported in part by the Office of Science of the Department of Energy, the Laboratory Directed Research and Development (LDRD) Program of Oak Ridge National Laboratory, Office of Advanced Scientific Computing Research, and the Scientific Discovery through Advanced Computing (SciDAC) program. This research is sponsored by the Artificial Intelligence Initiative as part of the Laboratory Directed Research and Development (LDRD) Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. An award of computer time was provided by the OLCF Director’s Discretion Project program using the OLCF award MAT250. This work used resources of the Oak Ridge Leadership Computing Facility and of the Edge Computing program at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doepublic-access-plan).
