Abstract
Optimal sub-sampling of large datasets from fluid dynamics simulations is essential for training reduced-order machine-learned models. A method using Shannon entropy was developed to weight flow features according to their level of information content, such that the most informative features can be extracted and used for training a surrogate model. The method is demonstrated on the canonical flow-over-a-cylinder problem simulated with OpenFOAM. Both time-independent predictions and temporal forecasting were investigated, as well as two types of prediction targets: local per-grid-point predictions and global per-time-step predictions. When used to train a surrogate model, our entropy-based sampling method typically outperforms random sampling and yields more reproducible results in fewer iterations. Finally, the method was used to train a surrogate model for turbulence in magnetohydrodynamic flows, which revealed various challenges and opportunities for future research.
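The idea of weighting samples by information content can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose details are not given in the abstract. It assumes a single scalar flow feature, a fixed histogram binning, and sampling weights proportional to the self-information $-\log_2 p$ of each sample's bin. The function names, the bin count, and the synthetic data are all hypothetical.

```python
import numpy as np


def bin_probabilities(values, bins=32):
    """Return each sample's bin index and the empirical probability of each bin.

    Assumes `values` is a 1-D array holding one scalar feature per sample.
    """
    values = np.asarray(values, dtype=float)
    edges = np.histogram_bin_edges(values, bins=bins)
    # Interior edges map every value to a bin index in 0..bins-1.
    bin_idx = np.digitize(values, edges[1:-1])
    counts = np.bincount(bin_idx, minlength=bins)
    return bin_idx, counts / values.size


def shannon_entropy(probabilities):
    """Shannon entropy in bits, H = -sum(p * log2(p)), skipping empty bins."""
    p = np.asarray(probabilities, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())


def information_weighted_sample(values, n_samples, bins=32, seed=0):
    """Draw sample indices with probability proportional to -log2(p_bin).

    Illustrative weighting only (an assumption, not the published method).
    """
    bin_idx, p = bin_probabilities(values, bins)
    weights = -np.log2(p[bin_idx])  # rare states carry more information
    if weights.sum() == 0:  # every sample shares one bin: fall back to uniform
        weights = np.ones_like(weights)
    rng = np.random.default_rng(seed)
    return rng.choice(len(weights), size=n_samples, replace=False,
                      p=weights / weights.sum())


# Synthetic stand-in for a flow feature: mostly near-zero values plus a
# small number of rare, widely spread ones (hypothetical data, not OpenFOAM output).
rng = np.random.default_rng(1)
feature = np.concatenate([rng.normal(0.0, 0.1, 9500),
                          rng.uniform(-3.0, 3.0, 500)])


def subsample_entropy(idx, bins=32):
    """Entropy of a subsample, measured on the full dataset's bin layout."""
    bin_idx, _ = bin_probabilities(feature, bins)
    counts = np.bincount(bin_idx[idx], minlength=bins)
    return shannon_entropy(counts / counts.sum())


weighted_idx = information_weighted_sample(feature, 500)
random_idx = rng.choice(feature.size, size=500, replace=False)
print("entropy, information-weighted:", subsample_entropy(weighted_idx))
print("entropy, uniform random:      ", subsample_entropy(random_idx))
```

On this synthetic data the information-weighted subsample covers the rare states more evenly than a uniform random draw, so its histogram entropy is higher. Whether that translates into a better surrogate model depends on the application and is not something this sketch demonstrates.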
Original language | English |
---|---|
Title of host publication | Proceedings of 2023 SC Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 |
Publisher | Association for Computing Machinery |
Pages | 73-80 |
Number of pages | 8 |
ISBN (Electronic) | 9798400707858 |
State | Published - Nov 12 2023 |
Event | 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 - Denver, United States; Duration: Nov 12 2023 → Nov 17 2023 |
Publication series
Name | ACM International Conference Proceeding Series |
---|
Conference
Conference | 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 |
---|---|
Country/Territory | United States |
City | Denver |
Period | 11/12/23 → 11/17/23 |
Funding
This research was sponsored by and used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility at the Oak Ridge National Laboratory (ORNL) supported by the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Moreover, we would like to express our appreciation to a number of colleagues who provided useful advice during the course of this work, including Joshua Brown, Jong Choi, and Mariia Karabin of ORNL, as well as OpenAI’s GPT-4.

Notice: This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://www.energy.gov/doe-public-access-plan).
Keywords
- clustering
- maximum entropy
- reduced-order
- sampling
- surrogate