DataSpaces: An interaction and coordination framework for coupled simulation workflows

Ciprian Docan, Manish Parashar, Scott Klasky

Research output: Contribution to journal › Article › peer-review

117 Scopus citations

Abstract

Emerging high-performance distributed computing environments are enabling new end-to-end formulations in science and engineering that involve multiple interacting processes and data-intensive application workflows. For example, current fusion simulation efforts are exploring coupled models and codes that simultaneously simulate separate application processes, such as the core and the edge turbulence. These components run on different high-performance computing resources and need to interact at runtime with each other and with services for data monitoring, data analysis and visualization, and data archiving. As a result, they require efficient and scalable support for dynamic and flexible couplings and interactions, which remains a challenge. This paper presents DataSpaces, a flexible interaction and coordination substrate that addresses this challenge. DataSpaces essentially implements a semantically specialized virtual shared space abstraction that can be associatively accessed by all components and services in the application workflow. It enables live data to be extracted from running simulation components, indexes this data online, and then allows it to be monitored, queried, and accessed by other components and services via the space using semantically meaningful operators. The underlying data transport is asynchronous, low-overhead, and largely memory-to-memory. The design, implementation, and experimental evaluation of DataSpaces using a coupled fusion simulation workflow are presented.
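
To make the idea of an associatively accessed shared space more concrete, the following is a minimal single-process sketch and not the DataSpaces API. The class `ToySpace`, the `put`/`get` method names, and the choice of indexing data by variable name, version, and a bounding box are all assumptions made for illustration. The real system described in the abstract is distributed and uses asynchronous, memory-to-memory transport, none of which is modeled here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical region descriptor: per-dimension inclusive (lo, hi) bounds.
Box = Tuple[Tuple[int, int], ...]


@dataclass
class Entry:
    box: Box          # region of the domain this payload covers
    payload: object   # the data itself (opaque in this sketch)


def overlaps(a: Box, b: Box) -> bool:
    """Return True if two boxes intersect in every dimension."""
    return all(alo <= bhi and blo <= ahi
               for (alo, ahi), (blo, bhi) in zip(a, b))


class ToySpace:
    """Illustrative in-memory stand-in for a shared space (an assumption)."""

    def __init__(self) -> None:
        # Index entries by (variable name, version) as they are inserted.
        self._index: Dict[Tuple[str, int], List[Entry]] = {}

    def put(self, name: str, version: int, box: Box, payload: object) -> None:
        """A producer inserts data described by name, version, and region."""
        self._index.setdefault((name, version), []).append(Entry(box, payload))

    def get(self, name: str, version: int, box: Box) -> List[object]:
        """A consumer retrieves every payload whose region overlaps the query."""
        entries = self._index.get((name, version), [])
        return [e.payload for e in entries if overlaps(e.box, box)]


# Two simulation components publish blocks of the same variable.
space = ToySpace()
space.put("temperature", 1, ((0, 3), (0, 3)), "core block")
space.put("temperature", 1, ((4, 7), (0, 3)), "edge block")

# A consumer asks by content (name, version, region), not by producer.
print(space.get("temperature", 1, ((2, 5), (1, 2))))
```

The consumer in this sketch never names the component that produced the data; it only describes what it wants, which is the sense of "associative" access assumed here.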

Original language: English
Pages (from-to): 163-181
Number of pages: 19
Journal: Cluster Computing
Volume: 15
Issue number: 2
State: Published - Jun 2012

Funding

Acknowledgements: The research presented in this paper is supported in part by the National Science Foundation via grant numbers IIP 0758566, CCF-0833039, DMS-0835436, CNS 0426354, IIS 0430826, and CNS 0723594, by the Department of Energy via grant number DE-FG02-06ER54857, and by an IBM Faculty Award, and was conducted as part of the Center for Autonomic Computing at Rutgers University. This work was conducted while author M. Parashar was working at the National Science Foundation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

Funders (funder number):
National Science Foundation: IIP 0758566, CNS 0426354, IIS 0430826, DMS-0835436, CCF-0833039, CNS 0723594
U.S. Department of Energy: DE-FG02-06ER54857
International Business Machines Corporation

    Keywords

    • Data distribution
    • Virtual shared space
    • Workflows
