Alea – Complex Job Scheduling Simulator

Dalibor Klusáček, Mehmet Soysal, Frédéric Suter

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

Using large computer systems such as HPC clusters to their full potential can be hard. Many problems and inefficiencies stem from the interactions between user workloads and system-level policies. These policies cover various setup choices of the resource management system (RMS) as well as the applied scheduling policy. While expert assessment and well-known best practices do their job when tuning performance, there is usually plenty of room for further improvement, e.g., by considering more efficient system setups or even radically new scheduling policies. Because such modifications are potentially damaging, it is advisable to evaluate them in a simulator first, which allows for repeated evaluations of various setups in a fully controlled manner. This paper presents the latest improvements of the Alea job scheduling simulator, which has been actively developed for over 10 years. We present both recently added advanced simulation capabilities and a set of real-life case studies in which Alea has been used to evaluate major modifications of real HPC and HTC systems.

Original language: English
Title of host publication: Parallel Processing and Applied Mathematics - 13th International Conference, PPAM 2019, Revised Selected Papers
Editors: Roman Wyrzykowski, Konrad Karczewski, Ewa Deelman, Jack Dongarra
Publisher: Springer
Pages: 217-229
Number of pages: 13
ISBN (Print): 9783030432218
State: Published - 2020
Externally published: Yes
Event: 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019 - Bialystok, Poland
Duration: Sep 8 2019 to Sep 11 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12044 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Parallel Processing and Applied Mathematics, PPAM 2019
Country/Territory: Poland
City: Bialystok
Period: 09/8/19 to 09/11/19

Funding

Acknowledgments. We acknowledge the support and computational resources provided by MetaCentrum under the program LM2015042, and the support provided by the project Reg. No. CZ.02.1.01/0.0/0.0/16_013/0001797, co-funded by the Ministry of Education, Youth and Sports of the Czech Republic.

Funder: Funder number
MetaCentrum: CZ.02.1.01/0.0/0.0/16_013/0001797, LM2015042
Ministerstvo Školství, Mládeže a Tělovýchovy: (none listed)

    Keywords

    • Alea
    • HPC
    • HTC
    • Scheduling
    • Simulation
