On-the-fly scheduling versus reservation-based scheduling for unpredictable workflows

Ana Gainaru, Hongyang Sun, Guillaume Aupy, Yuankai Huo, Bennett A. Landman, Padma Raghavan

Research output: Contribution to journal, Article, peer-review

7 Scopus citations

Abstract

Scientific insights in the coming decade will clearly depend on the effective processing of large data sets generated by dynamic heterogeneous applications typical of workflows in large data centers or of emerging fields like neuroscience. In this article, we show how these big data workflows have a unique set of characteristics that pose challenges for leveraging HPC methodologies, particularly in scheduling. Our findings indicate that execution times for these workflows are highly unpredictable and are not correlated with the size of the data set involved or the precise functions used in the analysis. We characterize this inherent variability and sketch the need for new scheduling approaches by quantifying significant gaps in achievable performance. Through simulations, we show how on-the-fly scheduling approaches can deliver benefits in both system-level and user-level performance measures. On average, we find improvements of up to 35% in system utilization and up to 45% in average stretch of the applications, illustrating the potential of increasing performance through new scheduling approaches.
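The two measures named in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in the scheduling literature: a job's stretch is its turnaround time (waiting plus running) divided by its running time, and system utilization is the fraction of available node-time spent running jobs. The article's exact definitions may differ, and the `Job` record and function names below are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical job record; field names are assumptions, not from the article."""
    submit: float  # time the job entered the queue
    start: float   # time the job began running
    end: float     # time the job finished
    nodes: int     # number of nodes held while running


def stretch(job: Job) -> float:
    # Turnaround time (wait + run) divided by run time; 1.0 means no waiting.
    return (job.end - job.submit) / (job.end - job.start)


def average_stretch(jobs: List[Job]) -> float:
    # User-level measure: mean stretch over all jobs.
    return sum(stretch(j) for j in jobs) / len(jobs)


def utilization(jobs: List[Job], total_nodes: int) -> float:
    # System-level measure: node-time spent running jobs divided by the
    # node-time available between the first submission and the last completion.
    t0 = min(j.submit for j in jobs)
    t1 = max(j.end for j in jobs)
    used = sum((j.end - j.start) * j.nodes for j in jobs)
    return used / (total_nodes * (t1 - t0))
```

For example, two jobs submitted together that each need the whole machine for 10 time units run back to back: the first has stretch 1, the second has stretch 2, and the machine is fully utilized. Lower stretch and higher utilization are better, which is the sense in which the abstract reports improvements.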

Original language: English
Pages (from-to): 1140-1158
Number of pages: 19
Journal: International Journal of High Performance Computing Applications
Volume: 33
Issue number: 6
State: Published - Nov 1 2019
Externally published: Yes

Funding

We thank the VUIIS Center for Computational Imaging for sharing de-identified logs without patient or investigator identifiable data. The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported in part by National Science Foundation grant CCF1719674 and the Vanderbilt Institutional Fund.

Funders (funder number)
National Science Foundation (CCF1719674)
Directorate for Computer and Information Science and Engineering (1719674)

    Keywords

    • On-the-fly scheduling
    • neuroscience applications
    • reservation-based scheduling
    • unpredictable workloads
