WfChef: Automated generation of accurate scientific workflow generators

Taina Coleman, Henri Casanova, Rafael Ferreira Da Silva

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

4 Scopus citations

Abstract

Scientific workflow applications have become mainstream and their automated and efficient execution on large-scale compute platforms is the object of extensive research and development. For these efforts to be successful, a solid experimental methodology is needed to evaluate workflow algorithms and systems. A foundation for this methodology is the availability of realistic workflow instances. Dozens of workflow instances for a few scientific applications are available in public repositories. While these are invaluable, they are limited: workflow instances are not available for all application scales of interest. To address this limitation, previous work has developed generators of synthetic, but representative, workflow instances of arbitrary scales. These generators are popular, but implementing them is a manual, labor-intensive process that requires expert application knowledge. As a result, these generators only target a handful of applications, even though hundreds of applications use workflows in production. In this work, we present WfChef, a framework that fully automates the process of constructing a synthetic workflow generator for any scientific application. Based on an input set of workflow instances, WfChef automatically produces a synthetic workflow generator. We define and evaluate several metrics for quantifying the realism of the generated workflows. Using these metrics, we compare the realism of the workflows generated by WfChef generators to that of the workflows generated by the previously available, hand-crafted generators. We find that the WfChef generators not only require zero development effort (because they are produced automatically), but also generate workflows that are more realistic than those generated by hand-crafted generators.

Original language: English
Title of host publication: Proceedings - IEEE 17th International Conference on eScience, eScience 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 159-168
Number of pages: 10
ISBN (Electronic): 9781665403610
State: Published - Sep 2021
Event: 17th IEEE International Conference on eScience, eScience 2021 - Virtual, Online, Austria
Duration: Sep 20 2021 to Sep 23 2021

Publication series

Name: Proceedings - IEEE 17th International Conference on eScience, eScience 2021

Conference

Conference: 17th IEEE International Conference on eScience, eScience 2021
Country/Territory: Austria
City: Virtual, Online
Period: 09/20/21 to 09/23/21

Funding

Acknowledgments. This work is funded by NSF contracts #1923539 and #1923621, and partly funded by NSF contracts #2016610 and #2016619. We also thank the NSF Chameleon Cloud for providing time grants to access their resources.

Keywords

  • Scientific workflows
  • Synthetic workflow generation
  • Workflow management systems
