Abstract
High-fidelity numerical simulations are necessary to drive design choices for future fusion devices, e.g. the ITER tokamak. XGC is a gyrokinetic Particle-in-Cell (PIC) application optimized for modeling the edge region plasma. The Coulomb collision operator is one of the more computationally expensive components of XGC. It requires solving a large number of small linear systems whose matrices share an identical sparsity pattern. These solves are still performed on the CPU, a major bottleneck given that exascale-class machines have over 95% of their compute performance on the GPUs. As the collision operator matrices are sparse, well-conditioned, and of medium size, batched iterative solvers utilizing sparse data structures are an attractive option. We showcase the acceleration of XGC with an integration of the Ginkgo batched iterative solvers with realistic test cases from ITER and DIII-D. We build on our previous work, which focused on integration into a collision kernel proxy application and showed the substantial promise of Ginkgo’s solvers. We present results obtained from three platforms: NVIDIA A100 GPUs (NERSC Perlmutter), AMD MI250X GPUs (OLCF Frontier), and Intel Max 1550 GPUs (ALCF Aurora), and show the reduction in time provided by the Ginkgo solver compared with the CPU solver. We present a weak scaling study to almost full scale on the NVIDIA platform. The results show that Ginkgo’s batched sparse iterative solvers enable efficient utilization of the GPU for this problem. The performance portability of Ginkgo in conjunction with Kokkos (used within XGC as the heterogeneous programming model) allows seamless execution on exascale-oriented heterogeneous architectures.
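To make the idea of a batched iterative solve over a shared sparsity pattern concrete, here is a minimal pure-Python sketch. It is an illustration only: it uses plain Jacobi iteration, which is not necessarily the solver used in the paper, and it does not use Ginkgo's API. The function name `batched_jacobi`, the tolerance, and the small tridiagonal test systems are all assumptions made for this example. The point it shows is that the CSR index arrays (`row_ptr`, `col_idx`) are stored once for the whole batch, while each system carries only its own nonzero values and right-hand side and converges independently.

```python
def batched_jacobi(row_ptr, col_idx, batch_values, batch_rhs,
                   tol=1e-10, max_iters=500):
    """Solve A_k x_k = b_k for every system k in a batch with Jacobi iteration.

    All systems share one CSR sparsity pattern (row_ptr, col_idx); only the
    nonzero values and the right-hand side differ from system to system.
    Assumes every row stores a nonzero diagonal entry.
    """
    n = len(row_ptr) - 1
    solutions, iterations = [], []
    for values, rhs in zip(batch_values, batch_rhs):
        x = [0.0] * n
        for it in range(1, max_iters + 1):
            x_new = [0.0] * n
            for i in range(n):
                diag, off = 0.0, 0.0
                # Walk the nonzeros of row i using the shared pattern.
                for p in range(row_ptr[i], row_ptr[i + 1]):
                    j = col_idx[p]
                    if j == i:
                        diag = values[p]
                    else:
                        off += values[p] * x[j]
                x_new[i] = (rhs[i] - off) / diag
            # Stop this system once the largest update falls below tol;
            # other systems in the batch keep their own iteration count.
            change = max(abs(a - b) for a, b in zip(x_new, x))
            x = x_new
            if change < tol:
                break
        solutions.append(x)
        iterations.append(it)
    return solutions, iterations


# Hypothetical test data: two 4x4 tridiagonal, diagonally dominant systems
# that share one pattern. Both right-hand sides were built from x = [1, 2, 3, 4].
row_ptr = [0, 2, 5, 8, 10]
col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
batch_values = [
    [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0],
    [5.0, 1.0, 1.0, 5.0, 1.0, 1.0, 5.0, 1.0, 1.0, 5.0],
]
batch_rhs = [
    [2.0, 4.0, 6.0, 13.0],
    [7.0, 14.0, 21.0, 23.0],
]

solutions, iterations = batched_jacobi(row_ptr, col_idx, batch_values, batch_rhs)
for k, (x, it) in enumerate(zip(solutions, iterations)):
    print(f"system {k}: iterations={it}, x={[round(v, 6) for v in x]}")
```

In this sketch the outer loop over systems runs sequentially. On a GPU, a batched solver would typically assign each system to its own group of threads so that all systems are solved concurrently, which is what makes the shared pattern and per-system convergence check attractive for this workload.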
| Original language | English |
|---|---|
| Title of host publication | High Performance Computing. ISC High Performance 2024 International Workshops, Revised Selected Papers |
| Editors | Michèle Weiland, Sarah Neuwirth, Carola Kruse, Tobias Weinzierl |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 127-140 |
| Number of pages | 14 |
| ISBN (Print) | 9783031737152 |
| State | Published - 2025 |
| Event | 39th ISC High Performance conference, ISC-HPC 2024 - Hamburg, Germany; Duration: May 12, 2024 → May 16, 2024 |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Volume | 15058 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 39th ISC High Performance conference, ISC-HPC 2024 |
|---|---|
| Country/Territory | Germany |
| City | Hamburg |
| Period | 05/12/24 → 05/16/24 |
Funding
We thank Dr. Seung-Hoe Ku (PPPL) for providing the EM test case and Dr. Timothy Williams (ANL) for porting XGC to Aurora. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. It used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program, under Contract No. DE-AC02-06CH11357. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231.
Keywords
- Batched solvers
- GPU computing
- Large application use-cases
- Performance portability
- Plasma physics