Abstract
Integrating machine learning with simulation is a growing trend; however, augmenting codes in a highly performant, distributed manner poses a software development challenge. In this work, we explore how to easily augment legacy simulation codes on high-performance computers (HPCs) with machine-learned surrogate models in a fast, scalable manner. Initial naïve augmentation attempts required significant code modification and resulted in significant slowdown. This led us to explore inference server techniques, which allow for model calls through drop-in functions. We investigated TensorFlow Serving with gRPC and RedisAI with SmartRedis for server-client inference implementations, in which the deep learning platform runs as a persistent process on HPC compute node GPUs and the simulation makes client calls while running on the CPUs. We evaluated inference performance for several use cases on SCOUT, an IBM POWER9 supercomputer, including real gas equations of state, machine-learned boundary conditions for rotorcraft aerodynamics, and super-resolution techniques. We discuss key findings on performance. The lessons learned may provide useful advice for researchers seeking to augment their simulation codes in an optimal manner.
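The server-client pattern described in the abstract can be illustrated with a minimal sketch. It is not the TensorFlow Serving or RedisAI API: it uses only the Python standard library, and every name in it (`surrogate_model`, `InferenceHandler`, `InferenceClient`, `start_server`) is hypothetical. An ideal-gas relation stands in for a trained surrogate, and a background thread stands in for the persistent GPU-side process. The point is only the shape of the approach: the model lives in a long-running server, and the simulation replaces a native routine with a thin client call.

```python
import json
import socket
import socketserver
import threading

# Assumption: specific gas constant of dry air, J/(kg*K). A placeholder only;
# a real deployment would load a trained surrogate model here instead.
R_SPECIFIC = 287.05


def surrogate_model(batch):
    """Stand-in for a learned model: ideal-gas pressure for (rho, T) pairs."""
    return [rho * R_SPECIFIC * temp for rho, temp in batch]


class InferenceHandler(socketserver.StreamRequestHandler):
    """Serve newline-delimited JSON requests until the client disconnects."""

    def handle(self):
        for line in self.rfile:
            inputs = json.loads(line)["inputs"]
            reply = json.dumps({"outputs": surrogate_model(inputs)}) + "\n"
            self.wfile.write(reply.encode())
            self.wfile.flush()


class InferenceClient:
    """Holds one persistent connection so each call avoids reconnecting."""

    def __init__(self, host, port):
        self._sock = socket.create_connection((host, port))
        self._file = self._sock.makefile("rw")

    def infer(self, batch):
        # Send one batched request and block until the reply line arrives.
        self._file.write(json.dumps({"inputs": batch}) + "\n")
        self._file.flush()
        return json.loads(self._file.readline())["outputs"]

    def close(self):
        self._file.close()
        self._sock.close()


def start_server(host="127.0.0.1", port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), InferenceHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_server()
    client = InferenceClient(*server.server_address)
    # The simulation would call this in place of its native routine.
    print(client.infer([[1.2, 300.0], [0.9, 250.0]]))
    client.close()
    server.shutdown()
    server.server_close()
```

Sending inputs in batches over a connection that stays open is a design choice of this sketch, intended to amortize per-call communication overhead; whether the systems evaluated in the paper behave the same way is not stated in the abstract.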
Original language | English |
---|---|
Title of host publication | Proceedings of AI4S 2022 |
Subtitle of host publication | Artificial Intelligence and Machine Learning for Scientific Applications, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 44-49 |
Number of pages | 6 |
ISBN (Electronic) | 9781665462075 |
State | Published - 2022 |
Externally published | Yes |
Event | 3rd IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications, AI4S 2022 - Dallas, United States; Duration: Nov 13 2022 → Nov 18 2022 |
Publication series
Name | Proceedings of AI4S 2022: Artificial Intelligence and Machine Learning for Scientific Applications, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|---|
Conference
Conference | 3rd IEEE/ACM International Workshop on Artificial Intelligence and Machine Learning for Scientific Applications, AI4S 2022 |
---|---|
Country/Territory | United States |
City | Dallas |
Period | 11/13/22 → 11/18/22 |
Funding
Distribution Statement A. Approved for Public Release; Distribution Unlimited. DoD HPCMP Approval 22-31. This material is based upon work supported by, or in part by, the Department of Defense High Performance Computing Modernization Program (HPCMP) under User Productivity, Enhanced Technology Transfer, and Training (PET) contracts #GS04T09DBC0017 and #47QFSA18K0111. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the DoD HPCMP.
Keywords
- HPC
- inference
- surrogate