PoliMOR: A Policy Engine "Made-to-Order" for Automated and Scalable Data Management in Lustre

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Modern supercomputing systems are increasingly reliant on hierarchical, multi-tiered file and storage system architectures due to cost-performance-capacity trade-offs. Within such multi-tiered systems, data management services are required to maintain healthy utilization, performance, and capacity levels. We present PoliMOR, a pragmatic and reliable policy-driven data management framework. PoliMOR is composed of modular, single-purpose agents that gather file system metadata and enforce policies on storage systems. PoliMOR facilitates automated and scalable data management with customizable agents tailored to HPC facility-specific storage systems and policies. Our evaluations demonstrate the scalability and performance of PoliMOR both by its individual agents and as a collective entity. We believe PoliMOR is widely applicable across HPC facilities with large-scale data management challenges and will garner interest from the HPC community, given its flexible and open-source nature.

Original language: English
Title of host publication: Proceedings of 2023 SC Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
Publisher: Association for Computing Machinery
Pages: 1202-1208
Number of pages: 7
ISBN (Electronic): 9798400707858
DOIs
State: Published - Nov 12 2023
Event: 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 - Denver, United States
Duration: Nov 12 2023 → Nov 17 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
Country/Territory: United States
City: Denver
Period: 11/12/23 → 11/17/23

Funding

This research was sponsored by and used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility at the Oak Ridge National Laboratory supported by the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Keywords

  • high performance computing
  • message queues
  • multi-tiered parallel file system
  • policy engine
  • storage and data management
