Access Patterns and Performance Behaviors of Multi-layer Supercomputer I/O Subsystems under Production Load

Jean Luca Bez, Ahmad Maroof Karimi, Arnab K. Paul, Bing Xie, Suren Byna, Philip Carns, Sarp Oral, Feiyi Wang, Jesse Hanley

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

16 Scopus citations

Abstract

Scientific computing workloads at HPC facilities have been shifting from traditional numerical simulations to AI/ML applications for training and inference while processing and producing ever-increasing amounts of scientific data. To address the growing need for increased storage capacity, lower access latency, and higher bandwidth, emerging technologies such as non-volatile memory are integrated into supercomputer I/O subsystems. With these emerging trends, we need a better understanding of the multilayer supercomputer I/O systems and ways to use these subsystems efficiently. In this work, we study the I/O access patterns and performance characteristics of two representative supercomputer I/O subsystems. Through an extensive analysis of year-long I/O logs on each system, we report new observations in I/O reads and writes, unbalanced use of storage system layers, and new trends in user behaviors at the HPC I/O middleware stack.

Original language: English
Title of host publication: HPDC 2022 - Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery, Inc
Pages: 43-55
Number of pages: 13
ISBN (Electronic): 9781450391993
State: Published - Jun 27 2022
Event: 31st International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2022 - Virtual, Online, United States
Duration: Jun 27 2022 to Jun 30 2022

Publication series

Name: HPDC 2022 - Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 31st International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2022
Country/Territory: United States
City: Virtual, Online
Period: 06/27/22 to 06/30/22

Funding

This work used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research used resources of the National Energy Research Scientific Computing Center under Contract No. DE-AC02-05CH11231. This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357.

Keywords

  • access patterns
  • high-performance computing
  • in-system storage
  • parallel file systems
  • production system
