Abstract
Multi-step scientific workflows have become prominent and powerful tools of data-driven scientific discovery. Run-time analytic techniques are now commonly used to mitigate the performance effects of using parallel file systems as staging areas during workflow execution. However, workflow construction and deployment for extreme-scale computing is still largely an ad hoc process with uneven support from existing tools. In this paper, we present SMARTBLOCK, an approach to designing generic, reusable components for end-to-end construction of workflows. Specifically, we demonstrate that a small set of SMARTBLOCK generic components can be reused to build a diverse set of workflows, using examples based on actual analytic processes with three well-known scientific codes. Our evaluation shows promising scaling properties as well as negligible overheads for using a modular approach over a custom, 'all-in-one' solution. As extreme-scale systems incorporate data analytics on simulation data as it is generated at rates that far outstrip available I/O bandwidth, tools such as SMARTBLOCK will become increasingly valuable for defining and deploying flexible, efficient workflows.
Original language | English |
---|---|
Title of host publication | Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1301-1308 |
Number of pages | 8 |
ISBN (Electronic) | 9781538634080 |
State | Published - Jun 30 2017 |
Externally published | Yes |
Event | 31st IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 - Orlando, United States. Duration: May 29 2017 → Jun 2 2017 |
Publication series
Name | Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 |
---|---|
Conference
Conference | 31st IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 |
---|---|
Country/Territory | United States |
City | Orlando |
Period | 05/29/17 → 06/02/17 |
Funding
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. This work was supported by the U.S. Department of Energy, under FWP 15-017577 and DE-SC0016313, program manager Lucy Nowell.
Keywords
- hpc
- in situ
- pipeline
- scientific workflows