DYFLOW: A flexible framework for orchestrating scientific workflows on supercomputers

Swati Singhal, Alan Sussman, Matthew Wolf, Kshitij Mehta, Jong Youl Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Modern scientific workflows are increasing in complexity with growth in computation power, incorporation of non-traditional computation methods, and advances in technologies enabling data streaming to support on-the-fly computation. These workflows have unpredictable runtime behaviors, and a fixed, predetermined resource assignment on supercomputers can be inefficient for overall performance and throughput. The inability to change resource assignments further limits scientists' ability to pursue science-driven opportunities or respond to failures. We introduce DYFLOW, a flexible framework that orchestrates scientific workflows on supercomputers based on user-designed policies. DYFLOW compartmentalizes orchestration stages into simplified constructs, and end-users can program and reuse them according to their workflow requirements through an easy-to-use interface. These constructs hide the intricacies of runtime management from end-users, for instance, procurement of information to understand the workflow state, assessment, and supervision of the runtime changes. DYFLOW is designed to work alongside existing workflow management systems and reuse the available (static) support for workflow management. We have integrated DYFLOW with an existing workflow management tool as a demonstration. With experiments performed on use cases from three types of scientific workflows and two different parallel architectures, we show that DYFLOW achieves the desired orchestration while incurring a small cost to carry out the runtime changes.
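The policy-driven idea described in the abstract can be illustrated with a minimal sketch. This is not DYFLOW's actual interface: the names `Action`, `slow_analysis_policy`, and `orchestrate`, the metric keys, and the 2x threshold are all hypothetical, chosen only to show how a user-written policy might assess the workflow state and how a supervising step might apply a resource change.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    """Hypothetical resource change: add or remove nodes for one task."""
    task: str
    delta_nodes: int  # positive = grow, negative = shrink


# A policy inspects the observed workflow state and may propose an action.
Policy = Callable[[Dict[str, float]], Optional[Action]]


def slow_analysis_policy(state: Dict[str, float]) -> Optional[Action]:
    """Hypothetical policy: grow the analysis task when it lags the simulation."""
    if state["analysis_step_time"] > 2.0 * state["simulation_step_time"]:
        return Action(task="analysis", delta_nodes=1)
    return None


def orchestrate(state: Dict[str, float],
                allocation: Dict[str, int],
                policies: List[Policy],
                free_nodes: int) -> Dict[str, int]:
    """One pass: assess the state with each policy, then apply feasible actions."""
    new_allocation = dict(allocation)  # leave the caller's allocation untouched
    for policy in policies:
        action = policy(state)
        if action is None:
            continue
        # Supervision step: apply a change only if it fits in the free nodes
        # and leaves the task with at least one node.
        target = new_allocation[action.task] + action.delta_nodes
        if action.delta_nodes <= free_nodes and target >= 1:
            new_allocation[action.task] = target
            free_nodes -= action.delta_nodes
    return new_allocation


# Assumed example state: the analysis step takes 3.5x as long as the simulation step.
state = {"simulation_step_time": 1.0, "analysis_step_time": 3.5}
allocation = {"simulation": 8, "analysis": 2}
print(orchestrate(state, allocation, [slow_analysis_policy], free_nodes=2))
```

Keeping policies as plain functions over an observed state is one way to make them reusable across workflows; the supervising step decides whether a proposed change is feasible, so policy authors do not have to reason about free resources themselves.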

Original language: English
Title of host publication: 50th International Conference on Parallel Processing Workshop, ICPP 2021 - Proceedings
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450384414
State: Published - Aug 9 2021
Event: 50th International Conference on Parallel Processing Workshop, ICPP 2021 - Virtual, Online, United States
Duration: Aug 9 2021 to Aug 12 2021

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 50th International Conference on Parallel Processing Workshop, ICPP 2021
Country/Territory: United States
City: Virtual, Online
Period: 08/9/21 to 08/12/21

Funding

This research used resources at the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. This article reports on work supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and National Nuclear Security Administration.

Funders (funder number)
U.S. Department of Energy Office of Science and National Nuclear Security Administration
Office of Science: 17-SC-20-SC, DE-AC05-00OR22725

    Keywords

    • Dynamic workflows
    • In situ workflows
    • Policy-driven workflow orchestration
    • Resource adaptation
