Abstract
Modern simulation workflows generate and analyze massive amounts of data using I/O libraries such as Adios2 and NetCDF. Although extensive work has optimized I/O during the simulation phase, executing analytical queries, which often require iterative traversals of large files to extract insights, remains cumbersome and is usually constrained by low I/O performance. Instead of waiting for the analysis phase to process queries, quantities can be derived asynchronously during data production and cached, speeding up future queries. In this work, we introduce Hades, a context-aware I/O layer designed to efficiently derive insights from selected quantities without compromising overall workflow performance. Hades actively and asynchronously computes and stores these quantities while the data is in transit. It leverages a hierarchical buffering system with data access-aware prefetching to ensure timely access to relevant data. It offers a flexible query interface that lets users define derived quantities and control data placement decisions. Hades is implemented using an Adios2 plugin engine and the Hermes buffering platform, enabling transparent use by any Adios-powered application or workflow. Experimental results demonstrate performance improvements of up to 3-4x for the tested real-world scientific producer-consumer workflows.
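The idea of deriving quantities asynchronously during data production and caching them for later queries can be illustrated with a minimal sketch. This is not the Hades, Adios2, or Hermes API: the `DerivingWriter` class, its `put` and `query` methods, and the in-memory dictionaries standing in for the file and the cache are all hypothetical names chosen for illustration, and only the Python standard library is used.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean


class DerivingWriter:
    """Toy write path (hypothetical) that derives per-step quantities in the background."""

    def __init__(self, operators):
        self.operators = operators   # name -> function(list[float]) -> float
        self.store = {}              # step -> raw data (stands in for the output file)
        self.pending = {}            # (step, name) -> Future holding the derived value
        self.pool = ThreadPoolExecutor(max_workers=2)

    def put(self, step, values):
        data = list(values)          # snapshot so the producer may reuse its buffer
        self.store[step] = data
        for name, fn in self.operators.items():
            # The derivation is submitted, not awaited, so put() returns immediately.
            self.pending[(step, name)] = self.pool.submit(fn, data)

    def query(self, step, name):
        # Served from the derived-quantity cache; the raw data is not rescanned.
        return self.pending[(step, name)].result()

    def close(self):
        self.pool.shutdown(wait=True)


# Producer side: write three steps while min/max/mean are derived in the background.
writer = DerivingWriter({"min": min, "max": max, "mean": fmean})
for step in range(3):
    writer.put(step, [float(step + i) for i in range(4)])

# Consumer side: queries read the cached derived values.
step2_max = writer.query(2, "max")
step1_mean = writer.query(1, "mean")
writer.close()
print(step2_max, step1_mean)
```

In this sketch the consumer's query cost is independent of the size of the raw data, because the traversal already happened while the step was being written. A real system would additionally have to decide where cached values live in the storage hierarchy, which the sketch does not attempt.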
Original language | English |
---|---|
Title of host publication | Proceedings - 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 577-586 |
Number of pages | 10 |
ISBN (Electronic) | 9798350395662 |
State | Published - 2024 |
Event | 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 - Philadelphia, United States; Duration: May 6 2024 → May 9 2024 |
Publication series
Name | Proceedings - 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 |
---|---|
Conference
Conference | 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2024 |
---|---|
Country/Territory | United States |
City | Philadelphia |
Period | 05/6/24 → 05/9/24 |
Funding
This work is supported by the U.S. Department of Energy (DOE) under DE-SC0023263.
Keywords
- Active Storage
- Context Awareness
- Data Operator
- Hierarchical Storage
- In-transit Computing
- Metadata Management