Periodic I/O scheduling for super-computers

Guillaume Aupy, Ana Gainaru, Valentin Le Fèvre

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

With the ever-growing need for data in HPC applications, congestion at the I/O level is becoming critical in super-computers. Architectural enhancements such as burst buffers and prefetching have been added to machines, but they are not sufficient to prevent congestion. Recent online I/O scheduling strategies have been put in place, but they add another congestion point and introduce overheads in the computation of applications. In this work, we show how to take advantage of the periodic nature of HPC applications in order to develop efficient periodic scheduling strategies for their I/O transfers. Our strategy computes a pattern once, during the job scheduling phase, that defines the I/O behavior of each application; the applications then run independently, transferring their I/O at the specified times. Our strategy limits the amount of congestion at the I/O node level and can be easily integrated into current job schedulers. We validate this model through extensive simulations and experiments by comparing it to state-of-the-art online solutions. Specifically, we show not only that our scheduler has the advantage of being decentralized, thus avoiding the overhead of online schedulers, but also that on Mira one can expect an average dilation improvement of 22% with an average throughput improvement of 32%. Finally, we show that one can expect these improvements to grow in the next generation of platforms, where the compute-I/O bandwidth imbalance increases.

Original language: English
Title of host publication: High Performance Computing Systems. Performance Modeling, Benchmarking, and Simulation - 8th International Workshop, Proceedings
Editors: Simon Hammond, Stephen Jarvis, Steven Wright
Publisher: Springer Verlag
Pages: 44-66
Number of pages: 23
ISBN (Print): 9783319729701
State: Published - 2018
Externally published: Yes
Event: 8th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, PMBS 2017 - [state] CO, United States
Duration: Nov 13 2017 → Nov 13 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10724 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computing Systems, PMBS 2017
Country/Territory: United States
City: [state] CO
Period: 11/13/17 → 11/13/17

Funding

Acknowledgement. This work was supported in part by the ANR Dash project. Part of this work was done while Guillaume Aupy and Valentin Le Fèvre were at Vanderbilt University. The authors would like to thank Anne Benoit and Yves Robert for helpful discussions.
