Abstract
NVIDIA has been the main provider of GPU hardware in HPC systems for over a decade. Most applications that benefit from GPUs have thus been developed and optimized for the NVIDIA software stack. Recent exascale HPC systems are, however, introducing GPUs from other vendors, for example the AMD GPU-based OLCF Frontier system, which has just become available. AMD GPUs cannot be accessed directly through the NVIDIA software stack, so application developers must port their code. This paper provides an overview of our experience porting and optimizing the CGYRO code, a widely used fusion simulation tool written in Fortran with OpenACC-based GPU acceleration. While porting from the NVIDIA compilers to the Cray compilers on the AMD systems was relatively straightforward, the performance optimization required more fine-tuning. During the optimization effort, we uncovered code sections that had performed well on NVIDIA GPUs but were unexpectedly slow on AMD GPUs. After AMD-targeted code optimizations, performance on AMD GPUs increased to meet our expectations. Modest speed improvements were also seen on NVIDIA GPUs, an unexpected benefit of this exercise.
Original language | English |
---|---|
Title of host publication | PEARC 2023 - Computing for the common good |
Subtitle of host publication | Practice and Experience in Advanced Research Computing |
Publisher | Association for Computing Machinery, Inc |
Pages | 246-250 |
Number of pages | 5 |
ISBN (Electronic) | 9781450399852 |
State | Published - Jul 23 2023 |
Event | 2023 Practice and Experience in Advanced Research Computing, PEARC 2023 - Portland, United States; Duration: Jul 23 2023 → Jul 27 2023 |
Publication series
Name | PEARC 2023 - Computing for the common good: Practice and Experience in Advanced Research Computing |
---|
Conference
Conference | 2023 Practice and Experience in Advanced Research Computing, PEARC 2023 |
---|---|
Country/Territory | United States |
City | Portland |
Period | 07/23/23 → 07/27/23 |
Funding
This work was partially supported by the U.S. Department of Energy under awards DE-FG02-95ER54309, DE-FC02-06ER54873, and DE-SC0017992, and by the U.S. National Science Foundation (NSF) under Grant OAC-1826967. An award of computer time was provided by the INCITE and ALCC programs. This research used resources of the Oak Ridge Leadership Computing Facility, which is an Office of Science User Facility supported under Contract DE-AC05-00OR22725. Computing resources were also provided by the National Energy Research Scientific Computing Center, which is an Office of Science User Facility supported under Contract DE-AC02-05CH11231.
Keywords
- Benchmarking
- FFT
- Fusion science
- GPU
- High Performance Computing
- OpenACC
- Performance