Bridging the Gap: User-Centric Energy Monitoring for Policy-Driven Application Optimization in HPC Data Centers

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Application energy optimization in HPC data centers faces two critical gaps: the lack of systematic methodologies that connect data center policies to application decisions, and the lack of accessible monitoring tools that enable data-driven optimization. We address both gaps through two complementary pillars. First, we present a methodology based on an extended weighted Energy Delay Product (EDP) that translates data center operational priorities and integrates energy considerations into the energy optimization workflow, which runs from continuous monitoring through targeted optimization. Second, we present Omnistat, a user-space monitoring tool that enables this methodology by giving developers direct access to actionable energy telemetry. Through deployment on the Frontier supercomputer and case studies exploring performance-energy trade-offs, we show how these pillars help establish energy as an integral optimization target and developers as active participants in data center efficiency.
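
The abstract does not define the extended weighted EDP it refers to. As a rough illustration only, the sketch below uses the generic weighted form E * T^w, where a larger exponent w gives runtime more priority than energy. The function names, the weight parameter, and the two run configurations are hypothetical and are not taken from the paper or from Omnistat.

```python
from typing import Dict, Tuple


def weighted_edp(energy_joules: float, runtime_seconds: float,
                 delay_weight: float = 1.0) -> float:
    """Generic weighted Energy Delay Product: E * T**w.

    Assumption: this is the textbook weighted form, not the paper's
    extended formulation, which the abstract does not specify.
    """
    if energy_joules < 0 or runtime_seconds <= 0:
        raise ValueError("energy must be >= 0 and runtime must be > 0")
    return energy_joules * runtime_seconds ** delay_weight


def best_config(configs: Dict[str, Tuple[float, float]],
                delay_weight: float) -> str:
    """Return the name of the configuration with the lowest weighted EDP."""
    return min(configs,
               key=lambda name: weighted_edp(*configs[name], delay_weight))


# Hypothetical measurements: (energy in joules, runtime in seconds).
# These numbers are made up for illustration.
configs = {
    "baseline": (1000.0, 10.0),
    "power_capped": (800.0, 12.0),
}

if __name__ == "__main__":
    for w in (0.0, 1.0, 2.0):
        print(f"delay_weight={w}: {best_config(configs, w)}")
```

With these made-up numbers, a weight of 1 favors the power-capped run, while a weight of 2 favors the faster baseline. That is the sense in which a policy expressed as a weight can change which application configuration counts as best.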

Original language: English
Title of host publication: Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
Publisher: Association for Computing Machinery, Inc
Pages: 2007-2016
Number of pages: 10
ISBN (Electronic): 9798400718717
State: Published - Nov 15 2025
Event: 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops - St. Louis, United States
Duration: Nov 16 2025 to Nov 21 2025

Publication series

Name: Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops

Conference

Conference: 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
Country/Territory: United States
City: St. Louis
Period: 11/16/25 to 11/21/25

Funding

This research used resources of the OLCF at ORNL, which is supported by DOE’s Office of Science under Contract No. DE-AC05-00OR22725. We are grateful to James B. White III, who provided insights that greatly assisted this work.

Keywords

  • Application Optimization
  • Energy Delay Product
  • Energy Efficiency
  • Monitoring
