Insights from Optimizing HPL Performance on Exascale Systems: A Comparative Analysis of Panel Factorization

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

High-Performance LINPACK (HPL) remains the primary benchmark for evaluating supercomputing performance. It comprises many components with substantial internal complexity, and its performance depends on a large number of parameters that interact in ways that are difficult to predict on large-scale heterogeneous supercomputer systems. We present a comprehensive performance analysis of HPL on Frontier, the world's first exascale supercomputer, which achieved an HPL performance of 1.35 exaflops. Through empirical parameter tuning, detailed modeling, and comparative evaluation, we uncover critical performance insights, share lessons learned, and outline best practices for effective parameter tuning on exascale systems. We introduce and evaluate two novel PDFACT strategies: a dedicated-thread (DT) variant and a GPU-based variant (GPUPDFACT) implemented using HIP cooperative groups. We demonstrate that GPU-based factorization outperforms conventional CPU-based PDFACT on Frontier's architecture. Our findings establish key performance factors for HPL on exascale systems and offer valuable guidance for future high-performance computing and benchmarking efforts.

Original language: English
Title of host publication: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
Publisher: Association for Computing Machinery, Inc
Pages: 397-410
Number of pages: 14
ISBN (Electronic): 9798400714665
State: Published - Nov 15 2025
Event: 2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 - St. Louis, United States
Duration: Nov 16 2025 to Nov 21 2025

Publication series

Name: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025

Conference

Conference: 2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
Country/Territory: United States
City: St. Louis
Period: 11/16/25 to 11/21/25

Funding

This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the U.S. Department of Energy (DOE). The U.S. government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Keywords

  • Benchmarking
  • communication algorithms
  • dense linear algebra
  • exascale computing
  • matrix factorization
  • performance analysis
