Optimal checkpointing period: Time vs. energy

Guillaume Aupy, Anne Benoit, Thomas Hérault, Yves Robert, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

This short paper deals with parallel scientific applications using non-blocking and periodic coordinated checkpointing to enforce resilience. We provide a model and detailed formulas for total execution time and consumed energy. We characterize the optimal period for both objectives, and we assess the range of time/energy trade-offs to be made by instantiating the model with a set of realistic scenarios for Exascale systems. We place particular emphasis on I/O transfers, because the relative cost of communication is expected to increase dramatically, both in terms of latency and consumed energy, for future Exascale platforms.

Original language: English
Title of host publication: High Performance Computing Systems
Subtitle of host publication: Performance Modeling, Benchmarking and Simulation - 4th International Workshop, PMBS 2013, Revised Selected Papers
Editors: Stephen A. Jarvis, Steven A. Wright, Simon D. Hammond
Publisher: Springer Verlag
Pages: 203-214
Number of pages: 12
ISBN (Electronic): 9783319102139
State: Published - 2014
Externally published: Yes
Event: 4th International Workshop on Performance Modeling, Benchmarking and Simulation of High-Performance Computing Systems, PMBS 2013 - Denver, United States
Duration: Nov 18 2013 → Nov 18 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8551
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th International Workshop on Performance Modeling, Benchmarking and Simulation of High-Performance Computing Systems, PMBS 2013
Country/Territory: United States
City: Denver
Period: 11/18/13 → 11/18/13
