Abstract
Modern supercomputing systems increasingly rely on hierarchical, multi-tiered file and storage system architectures because of cost-performance-capacity trade-offs. Within such multi-tiered systems, data management services are required to maintain healthy utilization, performance, and capacity levels. We present PoliMOR, a pragmatic and reliable policy-driven data management framework. PoliMOR is composed of modular, single-purpose agents that gather file system metadata and enforce policies on storage systems. PoliMOR facilitates automated and scalable data management with customizable agents tailored to HPC facility-specific storage systems and policies. Our evaluations demonstrate the scalability and performance of PoliMOR, both for its individual agents and for the framework as a whole. We believe PoliMOR is widely applicable across HPC facilities with large-scale data management challenges and will garner interest from the HPC community, given its flexible and open-source nature.
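The agent-based design described in the abstract can be illustrated with a minimal sketch. This is not PoliMOR's actual code or API: the `FileRecord` fields, the agent function names, the in-process `queue.Queue` standing in for a message queue, and the age-based purge policy are all assumptions made for illustration only.

```python
import queue
import time
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Metadata a scan agent might report for one file (hypothetical fields)."""
    path: str
    size_bytes: int
    atime: float  # last access time, seconds since the epoch


def scan_agent(records, scan_q):
    """Publish file metadata; a real agent would walk or query the file system."""
    for rec in records:
        scan_q.put(rec)


def policy_agent(scan_q, action_q, max_age_days, now):
    """Apply one illustrative policy: flag files not accessed within max_age_days."""
    cutoff = now - max_age_days * 86400
    while not scan_q.empty():
        rec = scan_q.get()
        if rec.atime < cutoff:
            action_q.put(("purge", rec.path))


def action_agent(action_q):
    """Collect the requested actions; a real agent would purge or migrate files."""
    performed = []
    while not action_q.empty():
        performed.append(action_q.get())
    return performed


def run_pipeline(records, max_age_days=90, now=None):
    """Run the three hypothetical agents in sequence over in-process queues."""
    now = time.time() if now is None else now
    scan_q, action_q = queue.Queue(), queue.Queue()
    scan_agent(records, scan_q)
    policy_agent(scan_q, action_q, max_age_days, now)
    return action_agent(action_q)


# Example: one stale file and one recently accessed file.
now = 1_700_000_000.0
records = [
    FileRecord("/scratch/a.dat", 1024, now - 200 * 86400),
    FileRecord("/scratch/b.dat", 2048, now - 5 * 86400),
]
actions = run_pipeline(records, max_age_days=90, now=now)
```

Because each agent communicates only through a queue, any one of them could in principle be swapped for a facility-specific implementation without changing the others, which is the kind of modularity the abstract describes.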
Original language | English |
---|---|
Title of host publication | Proceedings of 2023 SC Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 |
Publisher | Association for Computing Machinery |
Pages | 1202-1208 |
Number of pages | 7 |
ISBN (Electronic) | 9798400707858 |
State | Published - Nov 12 2023 |
Event | 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 - Denver, United States; Duration: Nov 12 2023 → Nov 17 2023 |
Publication series
Name | ACM International Conference Proceeding Series |
---|
Conference
Conference | 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 |
---|---|
Country/Territory | United States |
City | Denver |
Period | 11/12/23 → 11/17/23 |
Funding
This research was sponsored by and used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility at the Oak Ridge National Laboratory supported by the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Keywords
- high performance computing
- message queues
- multi-tiered parallel file system
- policy engine
- storage and data management