TY - GEN
T1 - Making Speculative Scheduling Robust to Incomplete Data
AU - Gainaru, Ana
AU - Pallez, Guillaume
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
AB - In this work, we study the robustness of speculative scheduling to data incompleteness. Speculative scheduling has made it possible to incorporate future types of applications into the design of HPC schedulers, specifically applications whose runtime is not perfectly known but can be modeled with probability distributions. Preliminary studies show the importance of speculative scheduling in dealing with stochastic applications when the application runtime model is completely known. In this work, we show how one can extract enough information even from incomplete behavioral data for a given HPC application so that speculative scheduling still performs well. Specifically, we show that for synthetic runtimes that follow usual probability distributions such as truncated normal or exponential, we can extract enough data from as few as 10 previous runs to be within 5% of the solution that has exact information. For real traces of applications, the performance with 10 data points varies with the application (within 20% of the full-knowledge solution), but converges fast (within 5% with 100 previous samples). Finally, a side effect of this study is to show the importance of the theoretical results obtained on continuous probability distributions for speculative scheduling. Indeed, we observe that the solutions for such distributions are more robust to incomplete data than the solutions for discrete distributions.
KW - HPC scheduling
KW - discrete and continuous estimators
KW - performance modeling
KW - stochastic applications
UR - http://www.scopus.com/inward/record.url?scp=85078706233&partnerID=8YFLogxK
U2 - 10.1109/ScalA49573.2019.00013
DO - 10.1109/ScalA49573.2019.00013
M3 - Conference contribution
AN - SCOPUS:85078706233
T3 - Proceedings of ScalA 2019: 10th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 62
EP - 71
BT - Proceedings of ScalA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 10th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2019
Y2 - 18 November 2019
ER -