TY - GEN
T1 - Persistent data staging services for data intensive in-situ scientific workflows
AU - Romanus, Melissa
AU - Zhang, Fan
AU - Jin, Tong
AU - Sun, Qian
AU - Bui, Hoang
AU - Parashar, Manish
AU - Choi, Jong
AU - Janhunen, Saloman
AU - Hager, Robert
AU - Klasky, Scott
AU - Chang, Choong Seock
AU - Rodero, Ivan
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/6/1
Y1 - 2016/6/1
AB - Scientific simulation workflows executing on very large scale computing systems are essential modalities for scientific investigation. The increasing scales and resolution of these simulations provide new opportunities for accurately modeling complex natural and engineered phenomena. However, the increasing complexity necessitates managing, transporting, and processing unprecedented amounts of data, and as a result, researchers are increasingly exploring data-staging and in-situ workflows to reduce data movement and data-related overheads. However, as these workflows become more dynamic in their structures and behaviors, data staging and in-situ solutions must evolve to support new requirements. In this paper, we explore how the service-oriented concept can be applied to extreme-scale in-situ workflows. Specifically, we explore persistent data staging as a service and present the design and implementation of DataSpaces as a Service, a service-oriented data staging framework. We use a dynamically coupled fusion simulation workflow to illustrate the capabilities of this framework and evaluate its performance and scalability.
UR - http://www.scopus.com/inward/record.url?scp=84978896007&partnerID=8YFLogxK
U2 - 10.1145/2912152.2912157
DO - 10.1145/2912152.2912157
M3 - Conference contribution
AN - SCOPUS:84978896007
T3 - DIDC 2016 - Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing
SP - 37
EP - 44
BT - DIDC 2016 - Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing
PB - Association for Computing Machinery, Inc
T2 - 6th ACM International Workshop on Data-Intensive Distributed Computing, DIDC 2016
Y2 - 1 June 2016
ER -