Accelerating Application Bulk Synchronous Writes in HPC Environments

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

High-bandwidth storage tiers are becoming more common because they can absorb high-rate, bursty I/O. Notably, the designs of these fast storage tiers differ from system to system. The variation among these layers and their non-uniform methods of access can pose challenges for applications seeking to run at multiple HPC facilities. Therefore, in this work, we present Spectral, a rapid-output abstraction library to accelerate application bulk-synchronous writes on HPC systems. We design Spectral to enable applications to use high-bandwidth storage, such as node-local storage and distributed write caches (e.g., burst buffers), transparently, without requiring modifications to the application or file system source code. The key idea is to allow applications to spend most of their time performing productive work and to require no source code changes, for maximum portability across different HPC architectures. Spectral internally re-routes write-only files through available high-performance I/O resources before ultimately migrating them to the shared global parallel file system. For instance, on Summit, Spectral transparently places application outputs on node-local storage and then uses asynchronous migration to the center-wide GPFS file system. We evaluate Spectral on the Summit HPC system (1024 nodes) using the IOR benchmark and real scientific applications. Spectral shows linear performance scaling, improving application write performance by over an order of magnitude when compared to GPFS.
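
The stage-then-migrate idea described in the abstract can be illustrated with a small sketch. This is not Spectral's implementation or API: the class `StagedWriter`, its methods, and the two directories are hypothetical names chosen for illustration, and the sketch requires the caller to use the class explicitly, whereas Spectral is described as needing no application changes. It only shows the general pattern of writing to a fast tier and moving completed files to a slower shared tier in the background.

```python
import os
import queue
import shutil
import tempfile
import threading


class StagedWriter:
    """Hypothetical helper: write to a fast tier, migrate in the background."""

    def __init__(self, fast_dir, global_dir):
        self.fast_dir = fast_dir        # stands in for node-local storage
        self.global_dir = global_dir    # stands in for the shared file system
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._migrate_loop, daemon=True)
        self._worker.start()

    def write(self, name, data):
        # The caller-visible write only touches the fast tier.
        staged = os.path.join(self.fast_dir, name)
        with open(staged, "wb") as f:
            f.write(data)
        # Hand the closed file to the migrator and return immediately.
        self._pending.put(name)

    def _migrate_loop(self):
        while True:
            name = self._pending.get()
            try:
                if name is None:  # sentinel: stop the worker
                    return
                shutil.move(os.path.join(self.fast_dir, name),
                            os.path.join(self.global_dir, name))
            finally:
                self._pending.task_done()

    def drain(self):
        # Block until every staged file has reached the global tier.
        self._pending.join()

    def close(self):
        self.drain()
        self._pending.put(None)
        self._worker.join()


def demo():
    # Two temporary directories play the roles of the fast and global tiers.
    with tempfile.TemporaryDirectory() as fast, \
            tempfile.TemporaryDirectory() as slow:
        writer = StagedWriter(fast, slow)
        for step in range(3):
            writer.write(f"checkpoint_{step}.bin", bytes([step]) * 1024)
        writer.close()
        return sorted(os.listdir(slow)), os.listdir(fast)
```

In this sketch the caller returns to computation as soon as the fast-tier write completes, and `drain()` is only needed when the data must be visible on the shared tier.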

Original language: English
Title of host publication: SNTA 2024 - Proceedings of the 2024 7th International Workshop on Systems and Network Telemetry and Analytics, Part of
Subtitle of host publication: HPDC 2024 - 33rd International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery, Inc
Pages: 7-14
Number of pages: 8
ISBN (Electronic): 9798400706486
DOIs
State: Published - Jun 3 2024
Event: 7th International Workshop on Systems and Network Telemetry and Analytics, SNTA 2024 - Pisa, Italy
Duration: Jun 3 2024 → Jun 7 2024

Publication series

Name: SNTA 2024 - Proceedings of the 2024 7th International Workshop on Systems and Network Telemetry and Analytics, Part of: HPDC 2024 - 33rd International Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 7th International Workshop on Systems and Network Telemetry and Analytics, SNTA 2024
Country/Territory: Italy
City: Pisa
Period: 06/3/24 → 06/7/24
