Abstract
The rise of machine learning (ML) applications and their use of mixed-precision arithmetic to perform interesting science are driving forces behind AI for science on HPC. The convergence of ML and HPC through mixed precision offers the possibility of transformational change in computational science. The HPL-AI benchmark is designed to measure mixed-precision performance, as opposed to the HPL benchmark, which measures double-precision performance. Pushing systems to their limits at extreme scale is nontrivial; little public literature explores the optimization of mixed-precision computations at this scale. In this work, we demonstrate how to scale the HPL-AI benchmark on the pre-exascale Summit and exascale Frontier systems at the Oak Ridge Leadership Computing Facility (OLCF) with a cross-platform design. We present the implementation, performance results, and a guide to the optimization strategies employed to deliver portable performance on both AMD and NVIDIA GPUs at extreme scale.
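For context, the core technique HPL-AI exercises is mixed-precision iterative refinement: factor the dense system once in low precision, then recover double-precision accuracy with cheap higher-precision correction steps, while the final solution must still pass an HPL-style scaled-residual accuracy check. The sketch below is an illustrative simplification in NumPy/SciPy, not the benchmark's implementation: FP32 stands in for the FP16 GPU factorization, simple residual correction stands in for the GMRES-based refinement, and the test matrix, iteration count, and tolerance are arbitrary demonstration choices.

```python
# Illustrative sketch of mixed-precision iterative refinement (the idea behind
# HPL-AI / HPL-MxP). NOT the benchmark code: FP32 stands in for the FP16 GPU
# factorization, and plain residual correction stands in for GMRES-based
# refinement. Problem size and tolerance are arbitrary demo choices.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 1024
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
b = rng.standard_normal(n)

# Step 1: the O(n^3) LU factorization is done once, in low precision.
lu, piv = lu_factor(A.astype(np.float32))

# Step 2: O(n^2) refinement iterations are carried out in double precision.
x = lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)
for _ in range(20):
    r = b - A @ x                                   # residual in FP64
    x += lu_solve((lu, piv), r.astype(np.float32))  # cheap correction from FP32 factors
    scale = np.linalg.norm(A, np.inf) * np.linalg.norm(x, np.inf) + np.linalg.norm(b, np.inf)
    if np.linalg.norm(r, np.inf) <= 1e-14 * scale:
        break

# Report a scaled residual, analogous in spirit to HPL's accuracy check.
final_scale = np.linalg.norm(A, np.inf) * np.linalg.norm(x, np.inf) + np.linalg.norm(b, np.inf)
print("scaled residual:", np.linalg.norm(b - A @ x, np.inf) / final_scale)
```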
Original language | English |
---|---|
Title of host publication | Proceedings of SC 2022 |
Subtitle of host publication | International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | IEEE Computer Society |
ISBN (Electronic) | 9781665454445 |
DOIs | |
State | Published - 2022 |
Event | 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022 - Dallas, United States |
Duration | Nov 13 2022 → Nov 18 2022 |
Publication series
Name | International Conference for High Performance Computing, Networking, Storage and Analysis, SC |
---|---|
Volume | 2022-November |
ISSN (Print) | 2167-4329 |
ISSN (Electronic) | 2167-4337 |
Conference
Conference | 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022 |
---|---|
Country/Territory | United States |
City | Dallas |
Period | 11/13/22 → 11/18/22 |
Funding
This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Exascale computing
- High performance computing
- Linear algebra
- Parallel programming