Abstract
High-Performance LINPACK (HPL) remains the primary benchmark for evaluating supercomputing performance. It comprises many components with substantial internal complexity, and its performance depends on a large number of parameters that interact in ways that are difficult to predict on large-scale heterogeneous supercomputer systems. We present a comprehensive performance analysis of HPL on Frontier, the world's first exascale supercomputer, which achieved an HPL performance of 1.35 exaflops. Through empirical parameter tuning, detailed modeling, and comparative evaluation, we uncover critical performance insights, share lessons learned, and outline best practices for effective parameter tuning on exascale systems. We introduce and evaluate two novel PDFACT strategies, a dedicated-thread (DT) variant and a GPU-based variant (GPUPDFACT) implemented with HIP cooperative groups, and demonstrate that GPU-based factorization outperforms conventional CPU-based PDFACT on Frontier's architecture. Our findings establish key performance factors for HPL on exascale systems and offer valuable guidance for future high-performance computing and benchmarking efforts.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 |
| Publisher | Association for Computing Machinery, Inc |
| Pages | 397-410 |
| Number of pages | 14 |
| ISBN (Electronic) | 9798400714665 |
| State | Published - Nov 15 2025 |
| Event | 2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 - St. Louis, United States. Duration: Nov 16 2025 → Nov 21 2025 |
Publication series
| Name | Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 |
|---|
Conference
| Conference | 2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 |
|---|---|
| Country/Territory | United States |
| City | St. Louis |
| Period | 11/16/25 → 11/21/25 |
Funding
This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the U.S. Department of Energy (DOE). The U.S. government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Benchmarking
- communication algorithms
- dense linear algebra
- exascale computing
- matrix factorization
- performance analysis