
Power-Capping Metric Evaluation for Improving Energy Efficiency in HPC Applications

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With high-performance computing systems now running at exascale, optimizing power-scaling management and resource utilization has become more critical than ever. This paper explores runtime power-capping optimizations that leverage integrated CPU-GPU power management on architectures like the NVIDIA GH200 superchip. We evaluate energy-performance metrics that account for simultaneous CPU and GPU power-capping effects by using two complementary approaches: speedup-energy-delay and a Euclidean distance-based multi-objective optimization method. By targeting a mostly compute-bound exascale science application, the Locally Self-Consistent Multiple Scattering (LSMS), we explore challenging scenarios to identify potential opportunities for energy savings in exascale applications, and we recognize that even modest reductions in energy consumption can have significant overall impacts. Our results highlight how GPU task-specific dynamic power-cap adjustments combined with integrated CPU-GPU power steering can improve the energy utilization of certain GPU tasks, thereby laying the groundwork for future adaptive optimization strategies.
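As a rough illustration of the Euclidean distance-based idea mentioned in the abstract, the sketch below ranks power-cap settings by their distance to an ideal point in normalized (runtime, energy) space. It is not the paper's formulation: the normalization against an uncapped baseline, the choice of the origin as the ideal point, the function name `rank_by_distance`, and all measurement values are assumptions made only for illustration.

```python
from math import hypot


def rank_by_distance(runs, baseline):
    """Rank settings by Euclidean distance to the origin in
    baseline-normalized (runtime, energy) space; smaller is better.

    Assumption: the ideal point is (0, 0) and both objectives are
    weighted equally. The paper may define its metric differently.
    """
    t0, e0 = baseline
    scored = []
    for label, (runtime_s, energy_j) in runs.items():
        # Normalize each objective by the uncapped baseline value.
        distance = hypot(runtime_s / t0, energy_j / e0)
        scored.append((distance, label))
    return sorted(scored)


# Hypothetical measurements: (runtime in seconds, energy in joules).
baseline = (100.0, 50000.0)
runs = {
    "uncapped": (100.0, 50000.0),
    "cap_A": (105.0, 44000.0),  # small slowdown, larger energy saving
    "cap_B": (130.0, 42000.0),  # large slowdown, slightly more saving
}

ranking = rank_by_distance(runs, baseline)
best_distance, best_label = ranking[0]
print(best_label, round(best_distance, 3))
```

With these made-up numbers, the setting that trades a small slowdown for a larger energy saving ranks ahead of both the uncapped run and the more aggressive cap, which is the kind of trade-off such a metric is meant to expose.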

Original language: English
Title of host publication: High Performance Computing - ISC High Performance 2025 International Workshops, Revised Selected Papers
Editors: Sarah Neuwirth, Arnab Kumar Paul, Tobias Weinzierl, Erin Claire Carson
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 231-244
Number of pages: 14
ISBN (Print): 9783032076113
DOIs
State: Published - 2026
Event: 40th International Conference on High Performance Computing, ISC High Performance 2025 - Hamburg, Germany
Duration: Jun 10 2025 - Jun 13 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 16091 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 40th International Conference on High Performance Computing, ISC High Performance 2025
Country/Territory: Germany
City: Hamburg
Period: 06/10/25 - 06/13/25

Funding

This material is based on work supported by the US Department of Energy’s Office of Science, Advanced Scientific Computing Research program through EXPRESS: 2023 Exploratory Research for Extreme-Scale Science. This research used resources of the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory, which is supported by the Office of Science of the US Department of Energy under contract DE-AC05-00OR22725.

Keywords

  • Automatic Power Steering
  • Energy Efficiency
  • Exascale Applications
  • GH200
  • HPC
  • LSMS
  • Performance Metrics
  • Power Capping
