Abstract
Atomistic simulation with machine learning-based potentials (MLPs) is an emerging tool for understanding materials' properties and behaviors and for predicting novel materials. Neural network potentials (NNPs) stand out in this field because they have shown accuracy comparable to ab initio electronic structure calculations in reproducing potential energy surfaces while being several orders of magnitude faster. However, such NNPs can perform poorly outside their training domain and often fail catastrophically in predicting rare events in molecular dynamics (MD) simulations. Rare events in atomistic modeling typically include chemical bond breaking/formation, phase transitions, and materials failure, which are critical for new materials design, synthesis, and manufacturing processes. In this study, we develop an automated active learning (AL) capability that combines NNPs with an enhanced sampling method, steered molecular dynamics, to capture bond-breaking events of alkane chains and derive NNPs for targeted applications. We develop a decision engine based on configurational similarity and uncertainty quantification (UQ), using data augmentation for effective AL loops, to distinguish the informative data among the enhanced-sampled configurations, and we show that the generated data set achieves an activation energy error of less than 1 kcal mol⁻¹. Furthermore, we have devised a strategy to alleviate training uncertainty within AL iterations through a carefully constructed data selection process that leverages an ensemble approach. Our study provides essential insight into the relationship between data and the performance of NNPs for the rare event of bond breaking under mechanical loading. It highlights strategies for developing NNPs for a broader range of materials and applications through active learning.
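The decision engine described in the abstract filters sampled configurations by uncertainty and configurational similarity. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes, hypothetically, that each candidate configuration comes with a fixed-length descriptor vector and a list of energies predicted by an ensemble of NNPs, that uncertainty is measured as the standard deviation across the ensemble, and that similarity is measured as Euclidean distance between descriptors. The function names and the thresholds `sigma_min` and `dist_min` are illustrative and do not come from the paper.

```python
import math
import statistics


def ensemble_uncertainty(energies):
    """Population standard deviation of energies predicted by the ensemble members."""
    return statistics.pstdev(energies)


def min_distance(descriptor, reference_set):
    """Smallest Euclidean distance from a descriptor to any reference descriptor."""
    if not reference_set:
        return math.inf
    return min(math.dist(descriptor, ref) for ref in reference_set)


def select_informative(candidates, train_descriptors, sigma_min, dist_min):
    """Return indices of candidates that are both uncertain and novel.

    candidates: list of (descriptor, ensemble_energies) pairs (hypothetical format).
    sigma_min:  assumed threshold on ensemble disagreement.
    dist_min:   assumed threshold on distance to already known configurations.
    """
    reference = list(train_descriptors)
    selected = []
    for idx, (descriptor, energies) in enumerate(candidates):
        uncertain = ensemble_uncertainty(energies) > sigma_min
        novel = min_distance(descriptor, reference) > dist_min
        if uncertain and novel:
            selected.append(idx)
            # Track the new pick so later near-duplicates are skipped.
            reference.append(descriptor)
    return selected


# Toy data: 2D descriptors and three-member ensemble predictions.
train = [(0.0, 0.0)]
candidates = [
    ((0.1, 0.0), [1.00, 1.01, 0.99]),   # close to training data, low disagreement
    ((2.0, 0.0), [1.0, 1.6, 0.4]),      # far from training data, high disagreement
    ((2.05, 0.0), [1.0, 1.7, 0.3]),     # near-duplicate of the previous candidate
    ((5.0, 5.0), [2.00, 2.01, 1.99]),   # far away, but the ensemble agrees
]
picked = select_informative(candidates, train, sigma_min=0.1, dist_min=0.5)
print(picked)
```

In an active learning loop of this kind, the selected configurations would be labeled with reference calculations, added to the training set, and the ensemble retrained before the next round of sampling. Requiring both criteria is one plausible way to avoid spending labels on redundant or already well-described configurations.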
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 514-527 |
Number of pages | 14 |
Journal | Digital Discovery |
Volume | 3 |
Issue number | 3 |
State | Published - Feb 20 2024 |
Funding
The authors acknowledge support by the Artificial Intelligence Initiative exploratory project as part of the Laboratory Directed Research and Development Program (LDRD) of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the US Department of Energy under contract DE-AC05-00OR22725. This research used resources of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.