Abstract
State-of-the-art multiphysics simulations running on large-scale leadership computing platforms have many variables contributing to their performance and scaling behavior. We recently encountered an interesting performance anomaly in Flash-X, a multiphysics, multicomponent simulation software package, while characterizing its performance on several large-scale HPC platforms. The anomaly was tracked down to the interaction between dynamic allocation of scratch data and data locality in the cache hierarchy. In this paper we present the details of the unexpected performance variability of Flash-X, our extensive analysis of it, which used the performance measurement tool TAU to collect the data and Python data analysis libraries to explore it, and our insights from this experience. In the process, we also discovered and removed or mitigated two additional performance-limiting bottlenecks.
Original language | English |
---|---|
Title of host publication | Proceedings of ProTools 2022 |
Subtitle of host publication | Workshop on Programming and Performance Visualization Tools, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 21-30 |
Number of pages | 10 |
ISBN (Electronic) | 9781665475648 |
State | Published - 2022 |
Event | 4th IEEE/ACM Workshop on Programming and Performance Visualization Tools, ProTools 2022 - Dallas, United States; Duration: Nov 13 2022 → Nov 18 2022 |
Publication series
Name | Proceedings of ProTools 2022: Workshop on Programming and Performance Visualization Tools, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|
Conference
Conference | 4th IEEE/ACM Workshop on Programming and Performance Visualization Tools, ProTools 2022 |
---|---|
Country/Territory | United States |
City | Dallas |
Period | 11/13/22 → 11/18/22 |
Funding
of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies publicly and display publicly, by or on behalf of the Government. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This work was also supported by the Scientific Discovery through Advanced Computing (SciDAC) program funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research (ASCR) under contract DE-SC0021299.
Keywords
- Multiphysics application
- Adaptive mesh refinement
- Performance
- Data analysis