Making Speculative Scheduling Robust to Incomplete Data

Ana Gainaru, Guillaume Pallez

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

3 Scopus citations

Abstract

In this work, we study the robustness of speculative scheduling to data incompleteness. Speculative scheduling has made it possible to incorporate future types of applications into the design of HPC schedulers, specifically applications whose runtime is not perfectly known but can be modeled with a probability distribution. Preliminary studies show the importance of speculative scheduling in dealing with stochastic applications when the application runtime model is completely known. In this work, we show how one can extract enough information even from incomplete behavioral data for a given HPC application so that speculative scheduling still performs well. Specifically, we show that for synthetic runtimes that follow common probability distributions such as truncated normal or exponential, we can extract enough data from as few as 10 previous runs to be within 5% of the solution that has exact information. For real application traces, the performance with 10 data points varies with the application (within 20% of the full-knowledge solution), but converges quickly (within 5% with 100 previous samples). Finally, a side effect of this study is to show the importance of the theoretical results obtained on continuous probability distributions for speculative scheduling. Indeed, we observe that the solutions for such distributions are more robust to incomplete data than the solutions for discrete distributions.

Original language: English
Title of host publication: Proceedings of ScalA 2019
Subtitle of host publication: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 62-71
Number of pages: 10
ISBN (Electronic): 9781728159898
State: Published - Nov 2019
Externally published: Yes
Event: 10th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2019 - Denver, United States
Duration: Nov 18 2019 → …

Publication series

Name: Proceedings of ScalA 2019: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 10th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2019
Country/Territory: United States
City: Denver
Period: 11/18/19 → …

Keywords

  • HPC scheduling
  • discrete and continuous estimators
  • performance modeling
  • stochastic applications
