The Challenge of Disproportionate Importance of Temporal Features in Predicting HPC Power Consumption

Chengcheng Li, Ahmad M. Karimi, Woong Shin, Hairong Qi, Feiyi Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

In this work, we demonstrate the challenges of predicting HPC cluster power consumption in the face of significant temporal skew in power consumption behavioral patterns. Predicting large power swings that span several megawatts has significant operational value for HPC centers; however, prediction is challenging because such events are relatively rare and because they deviate abruptly or disjointly from average power consumption levels. To study the impact of this challenge, we trained a recurrent neural network (RNN), as a reasonably sophisticated model, to predict power consumption using one year's worth of node power consumption data from the Summit supercomputer located in the Oak Ridge Leadership Computing Facility. By studying the prediction results, we found that although simple usage of RNN models can provide good results on average power consumption levels, it fails at predicting the power swings that have more operational value. Given these results, we discuss potential next steps in addressing such issues, aiming towards a robust usage of power prediction techniques in HPC operations.
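
The gap the abstract describes, good average accuracy alongside poor accuracy on rare swings, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the power trace is synthetic rather than Summit data, the predictor is a simple trailing mean standing in for the RNN, and the function names and the 2 MW swing threshold are hypothetical. It only shows why an error metric averaged over all time steps can hide large errors at the moments a swing occurs.

```python
def trailing_mean_forecast(series, window):
    """Predict each step as the mean of the previous `window` observations.

    A deliberately simple stand-in for a learned model (not the paper's RNN).
    """
    preds = []
    for t in range(len(series)):
        # At t == 0 there is no history, so fall back to the first value.
        history = series[max(0, t - window):t] or [series[0]]
        preds.append(sum(history) / len(history))
    return preds


def swing_indices(series, threshold):
    """Time steps where power changes by at least `threshold` in one step."""
    return [t for t in range(1, len(series))
            if abs(series[t] - series[t - 1]) >= threshold]


def mean_abs_error(actual, predicted, indices):
    """Mean absolute error restricted to the given time steps."""
    indices = list(indices)
    return sum(abs(actual[i] - predicted[i]) for i in indices) / len(indices)


# Synthetic trace in MW (assumed values): flat 5 MW baseline with two
# short, abrupt excursions to 9 MW.
power_mw = [5.0] * 200
for start in (60, 150):
    for t in range(start, start + 5):
        power_mw[t] = 9.0

preds = trailing_mean_forecast(power_mw, window=10)
swings = swing_indices(power_mw, threshold=2.0)

overall_mae = mean_abs_error(power_mw, preds, range(len(power_mw)))
swing_mae = mean_abs_error(power_mw, preds, swings)

print(f"overall MAE: {overall_mae:.2f} MW")
print(f"MAE at swing steps: {swing_mae:.2f} MW")
```

On this toy trace the error at swing steps is several times the overall average, because the many quiet time steps dominate the aggregate metric. Reporting error separately on swing events is one way to expose that imbalance when evaluating a real model.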

Original language: English
Title of host publication: Proceedings - 2021 IEEE International Conference on Cluster Computing, Cluster 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 632-636
Number of pages: 5
ISBN (Electronic): 9781728196664
DOIs
State: Published - 2021
Event: 2021 IEEE International Conference on Cluster Computing, Cluster 2021 - Virtual, Portland, United States
Duration: Sep 7 2021 → Sep 10 2021

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2021-September
ISSN (Print): 1552-5244

Conference

Conference: 2021 IEEE International Conference on Cluster Computing, Cluster 2021
Country/Territory: United States
City: Virtual, Portland
Period: 09/7/21 → 09/10/21

Funding

This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This work was supported by, and used the resources of, the Oak Ridge Leadership Computing Facility, located in the National Center for Computational Sciences at ORNL, which is managed by UT Battelle, LLC for the U.S. DOE (under the contract No. DE-AC05-00OR22725).

Funder: U.S. Department of Energy
Funder number: DE-AC05-00OR22725

    Keywords

    • HPC
    • Machine learning
    • Power consumption
    • Time-series prediction
