TY - GEN
T1 - Evaluating Scientific Workflow Engines for Data and Compute Intensive Discoveries
AU - Singh, Rina
AU - Graves, Jeffrey A.
AU - Anantharaj, Valentine
AU - Sukumar, Sreenivas R.
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
N2 - Workflow engines used to script scientific experiments involving numerical simulation, data analysis, instruments, edge sensors, and artificial intelligence have to deal with the complexities of hardware, software, resource availability, and the collaborative nature of science. In this paper, we survey workflow engines used in data-intensive and compute-intensive discovery pipelines from scientific disciplines such as astronomy, high energy physics, earth system science, bio-medicine, and material science and present a qualitative analysis of their respective capabilities. We compare 5 popular workflow engines and their differentiated approach to job orchestration, job launching, data management and provenance, security authentication, ease-of-use, workflow description, and scripting semantics. The comparisons presented in this paper allow practitioners to choose the appropriate engine for their scientific experiment and lead to recommendations for future work.
AB - Workflow engines used to script scientific experiments involving numerical simulation, data analysis, instruments, edge sensors, and artificial intelligence have to deal with the complexities of hardware, software, resource availability, and the collaborative nature of science. In this paper, we survey workflow engines used in data-intensive and compute-intensive discovery pipelines from scientific disciplines such as astronomy, high energy physics, earth system science, bio-medicine, and material science and present a qualitative analysis of their respective capabilities. We compare 5 popular workflow engines and their differentiated approach to job orchestration, job launching, data management and provenance, security authentication, ease-of-use, workflow description, and scripting semantics. The comparisons presented in this paper allow practitioners to choose the appropriate engine for their scientific experiment and lead to recommendations for future work.
KW - Converged Workloads
KW - Data Intensive Discoveries
KW - End-to-End Workflows
KW - Scientific Experiments
KW - Scientific Workflows
KW - Workflow Engines
UR - http://www.scopus.com/inward/record.url?scp=85081335274&partnerID=8YFLogxK
U2 - 10.1109/BigData47090.2019.9006223
DO - 10.1109/BigData47090.2019.9006223
M3 - Conference contribution
AN - SCOPUS:85081335274
T3 - Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
SP - 4553
EP - 4560
BT - Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
A2 - Baru, Chaitanya
A2 - Huan, Jun
A2 - Khan, Latifur
A2 - Hu, Xiaohua Tony
A2 - Ak, Ronay
A2 - Tian, Yuanyuan
A2 - Barga, Roger
A2 - Zaniolo, Carlo
A2 - Lee, Kisung
A2 - Ye, Yanfang Fanny
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE International Conference on Big Data, Big Data 2019
Y2 - 9 December 2019 through 12 December 2019
ER -