Towards a Predictive Energy Model for HPC Runtime Systems Using Supervised Learning

Gence Ozer, Sarthak Garg, Neda Davoudi, Gabrielle Poerwawinata, Matthias Maiterth, Alessio Netti, Daniele Tafani

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

6 Scopus citations

Abstract

High-Performance Computing (HPC) systems collect vast amounts of operational data through monitoring frameworks, often augmented with additional information from schedulers and runtime systems. This data can be put to use for operational purposes rather than serving only as a pool for post-mortem analysis. This work focuses on deriving a supervised-learning model that enables optimal selection of CPU frequency during the execution of a job, with the objective of minimizing the energy consumption of an HPC system. Our model is trained on sensor data and performance metrics collected with two distinct open-source frameworks for monitoring and runtime optimization. Our results show that the model predicts CPU power draw and the number of instructions retired under realistic dynamic runtime settings within a relatively low error margin.
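The frequency-selection idea in the abstract can be illustrated with a minimal sketch. It assumes a trained model is already available behind a hypothetical `predict` callable that returns predicted CPU power draw (watts) and instructions retired per second for a candidate frequency. The selection rule (minimize predicted joules per instruction), the function names, and the toy numbers are all assumptions made for illustration; the abstract does not state the authors' actual selection rule.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical predictor: frequency in GHz -> (predicted watts, predicted instructions/s).
# In practice this role would be played by a trained model; here it is an assumption.
Predictor = Callable[[float], Tuple[float, float]]


def select_frequency(frequencies: Iterable[float], predict: Predictor) -> float:
    """Return the candidate frequency with the lowest predicted joules per instruction."""
    best_freq, best_cost = None, float("inf")
    for freq in frequencies:
        power_w, ips = predict(freq)
        if ips <= 0:
            continue  # skip candidates with no predicted forward progress
        cost = power_w / ips  # W / (instructions/s) = joules per instruction
        if cost < best_cost:
            best_freq, best_cost = freq, cost
    if best_freq is None:
        raise ValueError("no candidate frequency had a positive predicted instruction rate")
    return best_freq


# Made-up illustrative predictions, not measurements from the paper.
TOY_PREDICTIONS = {
    1.2: (60.0, 1.0e9),
    1.8: (80.0, 1.6e9),
    2.4: (130.0, 2.0e9),
}


def toy_predict(freq_ghz: float) -> Tuple[float, float]:
    return TOY_PREDICTIONS[freq_ghz]


chosen = select_frequency(sorted(TOY_PREDICTIONS), toy_predict)
print(f"selected frequency: {chosen} GHz")
```

Dividing predicted power by predicted instruction rate gives energy per unit of work, which is one plausible way to combine the two predicted quantities into a single objective; other objectives, such as energy-delay product, would fit the same structure.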

Original language: English
Title of host publication: Euro-Par 2019
Subtitle of host publication: Parallel Processing Workshops - International Workshops, Revised Selected Papers
Editors: Ulrich Schwardmann, Christian Boehme, Dora B. Heras, Valeria Cardellini, Emmanuel Jeannot, Antonio Salis, Claudio Schifanella, Ravi Reddy Manumachu, Dieter Schwamborn, Laura Ricci, Oh Sangyoon, Thomas Gruber, Laura Antonelli, Stephen L. Scott
Publisher: Springer
Pages: 626-638
Number of pages: 13
ISBN (Print): 9783030483395
State: Published - 2020
Externally published: Yes
Event: 25th International European Conference on Parallel and Distributed Computing, EuroPar 2019 - Göttingen, Germany
Duration: Aug 26 2019 to Aug 30 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11997 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International European Conference on Parallel and Distributed Computing, EuroPar 2019
Country/Territory: Germany
City: Göttingen
Period: 08/26/19 to 08/30/19

Keywords

  • DVFS
  • Energy efficiency
  • Monitoring systems
  • Random forest
  • Runtime systems
