TY - GEN
T1 - SmartBlock
T2 - 31st IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017
AU - Champsaur, Alexis
AU - Lofstead, Jay
AU - Dayal, Jai
AU - Wolf, Matthew
AU - Eisenhauer, Greg
AU - Widener, Patrick
AU - Gavrilovska, Ada
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/6/30
Y1 - 2017/6/30
AB - Multi-step scientific workflows have become prominent and powerful tools of data-driven scientific discovery. Run-time analytic techniques are now commonly used to mitigate the performance effects of using parallel file systems as staging areas during workflow execution. However, workflow construction and deployment for extreme-scale computing is still largely an ad hoc process with uneven support from existing tools. In this paper, we present SMARTBLOCK, an approach to designing generic, reusable components for end-to-end construction of workflows. Specifically, we demonstrate that a small set of SMARTBLOCK generic components can be reused to build a diverse set of workflows, using examples based on actual analytic processes with three well-known scientific codes. Our evaluation shows promising scaling properties as well as negligible overheads for using a modular approach over a custom, 'all-in-one' solution. As extreme-scale systems incorporate data analytics on simulation data as it is generated at rates that far outstrip available I/O bandwidth, tools such as SMARTBLOCK will become increasingly valuable for defining and deploying flexible, efficient workflows.
KW - hpc
KW - in situ
KW - pipeline
KW - scientific workflows
UR - http://www.scopus.com/inward/record.url?scp=85028045378&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2017.149
DO - 10.1109/IPDPSW.2017.149
M3 - Conference contribution
AN - SCOPUS:85028045378
T3 - Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017
SP - 1301
EP - 1308
BT - Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 29 May 2017 through 2 June 2017
ER -