Abstract
The prevalence of scientific workflows with high computational demands calls for their execution on various distributed computing platforms, including large-scale leadership-class high-performance computing (HPC) clusters. To handle the deployment, monitoring, and optimization of workflow executions, many workflow systems have been developed over the past decade. There is a need for workflow benchmarks that can be used to evaluate the performance of workflow systems on current and future software stacks and hardware platforms. We present a generator of realistic workflow benchmark specifications that can be translated into benchmark code to be executed with current workflow systems. Our approach generates workflow tasks with arbitrary performance characteristics (CPU, memory, and I/O usage) and with realistic task dependency structures based on those seen in production workflows. We present experimental results that show that our approach generates benchmarks that are representative of production workflows, and conduct a case study to demonstrate the use and usefulness of our generated benchmarks to evaluate the performance of workflow systems under different configuration scenarios.
Original language | English |
---|---|
Title of host publication | Proceedings of PMBS 2022 |
Subtitle of host publication | Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 100-111 |
Number of pages | 12 |
ISBN (Electronic) | 9781665451857 |
State | Published - 2022 |
Event | 13th IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2022 - Dallas, United States; Duration: Nov 13 2022 → Nov 18 2022 |
Publication series
Name | Proceedings of PMBS 2022: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|
Conference
Conference | 13th IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2022 |
---|---|
Country/Territory | United States |
City | Dallas |
Period | 11/13/22 → 11/18/22 |
Funding
This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for U.S. Government purposes. The DOE will provide public access to these results in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). Acknowledgments. This work is funded by NSF contracts #2106059, #2106147, #2103489, #2103508, #1923539, and #1923621, and supported by ECP (17-SC-20-SC), a collaborative effort of the U.S. DOE Office of Science and the NNSA. This research used resources of the OLCF at ORNL, which is supported by the Office of Science of the U.S. DOE under Contract No. DE-AC05-00OR22725. We thank the NSF Chameleon Cloud for providing access to their resources.
Keywords
- distributed computing
- scientific workflows
- workflow benchmarks