Learning together: Towards foundation models for machine learning interatomic potentials with meta-learning

Alice E.A. Allen, Nicholas Lubbers, Sakib Matin, Justin Smith, Richard Messerly, Sergei Tretiak, Kipton Barros

Research output: Contribution to journal, Article, peer-review

2 Scopus citations

Abstract

The development of machine learning models has led to an abundance of datasets containing quantum mechanical (QM) calculations for molecular and material systems. However, traditional training methods for machine learning models are unable to leverage the plethora of data available as they require that each dataset be generated using the same QM method. Taking machine learning interatomic potentials (MLIPs) as an example, we show that meta-learning techniques, a recent advancement from the machine learning community, can be used to fit multiple levels of QM theory in the same training process. Meta-learning changes the training procedure to learn a representation that can be easily retrained for new tasks with small amounts of data. We then demonstrate that meta-learning enables simultaneous training on multiple large organic molecule datasets. As a proof of concept, we examine the performance of an MLIP refit to a small drug-like molecule and show that pre-training potentials to multiple levels of theory with meta-learning improves performance. This difference in performance can be seen both in the reduced error and in the improved smoothness of the potential energy surface produced. We therefore show that meta-learning can utilize existing datasets with inconsistent QM levels of theory to produce models that are better at specializing to new datasets. This opens new routes for creating pre-trained foundation models for interatomic potentials.
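The idea of learning an initialization that can be refit to a new task from little data can be illustrated with a toy sketch. The abstract does not name a specific meta-learning algorithm, so the example below assumes Reptile, one common first-order method, and replaces the MLIP with a two-parameter linear model. Each toy task stands in for a dataset at a different level of theory (same slope, different offset). All function names and hyperparameters here are hypothetical and are not taken from the paper.

```python
import numpy as np

def make_task(rng, slope, offset, n=64):
    """Toy stand-in for one dataset: y = slope * x + offset plus small noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = slope * x + offset + 0.01 * rng.normal(size=n)
    return x, y

def sgd_steps(params, x, y, lr, steps):
    """A few gradient-descent steps on mean squared error for y = w * x + b."""
    w, b = params
    for _ in range(steps):
        err = w * x + b - y
        w = w - lr * 2.0 * np.mean(err * x)
        b = b - lr * 2.0 * np.mean(err)
    return np.array([w, b])

def mse(params, x, y):
    """Mean squared error of the linear model on one task."""
    w, b = params
    return float(np.mean((w * x + b - y) ** 2))

def reptile(tasks, meta_steps=200, inner_steps=5, inner_lr=0.1, meta_lr=0.5, seed=0):
    """Reptile-style outer loop (assumed algorithm, not confirmed by the source)."""
    rng = np.random.default_rng(seed)
    params = np.zeros(2)
    for _ in range(meta_steps):
        # Sample one task, adapt to it briefly, then move the shared
        # initialization part of the way toward the adapted parameters.
        x, y = tasks[rng.integers(len(tasks))]
        adapted = sgd_steps(params, x, y, inner_lr, inner_steps)
        params = params + meta_lr * (adapted - params)
    return params

rng = np.random.default_rng(1)
# Three toy "levels of theory": identical slope, shifted offsets.
tasks = [make_task(rng, 2.0, offset) for offset in (-0.5, 0.0, 0.5)]
init = reptile(tasks)

# Refit to a small new task from the meta-learned start and from zeros.
x_new, y_new = make_task(rng, 2.0, 0.3, n=16)
meta_fit = sgd_steps(init, x_new, y_new, lr=0.1, steps=3)
scratch_fit = sgd_steps(np.zeros(2), x_new, y_new, lr=0.1, steps=3)
print("meta-init error:", mse(meta_fit, x_new, y_new))
print("zero-init error:", mse(scratch_fit, x_new, y_new))
```

In this toy setting the meta-learned initialization already captures what the tasks share (the slope), so a few refit steps on the new task only need to correct the offset. This is meant only as an analogy for refitting a pre-trained potential to a new molecule, not as a reproduction of the paper's results.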

Original language: English
Article number: 154
Journal: npj Computational Materials
Volume: 10
Issue number: 1
State: Published - Dec 2024
Externally published: Yes

Funding

This work was supported by the United States Department of Energy (US DOE), Office of Science, Basic Energy Sciences, Chemical Sciences, Geosciences, and Biosciences Division under Triad National Security, LLC (‘Triad’) contract grant no. 89233218CNA000001 (FWP: LANLE3F2). A. E. A. Allen and S. Matin also acknowledge the Center for Nonlinear Studies. Computer time was provided by the CCS-7 Darwin cluster at LANL. LAUR-23-27568.

Funders (funder number, where listed)
Basic Energy Sciences
U.S. Department of Energy
Office of Science
Chemical Sciences, Geosciences, and Biosciences Division: 89233218CNA000001, LANLE3F2
Center for Nonlinear Studies: LAUR-23-27568
