TY - JOUR
T1 - Enabling high-speed asynchronous data extraction and transfer using DART
AU - Docan, Ciprian
AU - Parashar, Manish
AU - Klasky, Scott
PY - 2010/6/25
Y1 - 2010/6/25
N2 - As the complexity and scale of applications grow, managing and transporting the large amounts of data they generate are quickly becoming a significant challenge. Moreover, the interactive and real-time nature of emerging applications, as well as their increasing runtime, make online data extraction and analysis a key requirement in addition to traditional data I/O and archiving. To be effective, online data extraction and transfer should impose minimal additional synchronization requirements, should have minimal impact on the computational performance and communication latencies, maintain overall quality of service, and ensure that no data is lost. In this paper we present Decoupled and Asynchronous Remote Transfers (DART), an efficient data transfer substrate that effectively addresses these requirements. DART is a thin software layer built on RDMA technology to enable fast, low-overhead, and asynchronous access to data from a running simulation, and supports high-throughput, low-latency data transfers. DART has been integrated with applications simulating fusion plasma in a Tokamak, being developed at the Center for Plasma Edge Simulation (CPES), a DoE Office of Fusion Energy Science (OFES) Fusion Simulation Project (FSP). A performance evaluation using the Gyrokinetic Toroidal Code and XGC-1 particle-in-cell-based FSP simulations running on the Cray XT3/XT4 system at Oak Ridge National Laboratory demonstrates how DART can effectively and efficiently offload simulation data to local service and remote analysis nodes, with minimal overheads on the simulation itself.
KW - Asynchronous transfers
KW - Low-overhead
KW - RDMA
UR - http://www.scopus.com/inward/record.url?scp=77953455120&partnerID=8YFLogxK
U2 - 10.1002/cpe.1567
DO - 10.1002/cpe.1567
M3 - Article
AN - SCOPUS:77953455120
SN - 1532-0626
VL - 22
SP - 1181
EP - 1204
JO - Concurrency and Computation: Practice and Experience
JF - Concurrency and Computation: Practice and Experience
IS - 9
ER -