Analyzing File Access Patterns on Large-Scale HPC Systems: Opportunities for File Prefetching

Ahmad Maroof Karimi, Arnab K. Paul, Jong Youl Choi, Lipeng Wan, Feiyi Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper explores the potential opportunities for implementing file prefetching techniques on large-scale high-performance computing (HPC) systems. Specifically, we investigate the file access patterns of various applications across multiple scientific domains using two years' worth of Darshan I/O traces obtained from the Summit supercomputer. We identify recurring trends and patterns which indicate that prefetching can be effectively leveraged to improve data access performance on HPC systems. This study serves as a valuable reference for system architects and developers in the HPC community, providing insights into the opportunities and challenges associated with enabling file prefetching on large-scale HPC systems.

Original language: English
Title of host publication: Proceedings - 2023 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2023
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350319484
State: Published - 2023
Event: 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2023 - Stony Brook, United States
Duration: Oct 16 2023 to Oct 18 2023

Publication series

Name: Proceedings - IEEE Computer Society's Annual International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems, MASCOTS
ISSN (Print): 1526-7539

Conference

Conference: 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, MASCOTS 2023
Country/Territory: United States
City: Stony Brook
Period: 10/16/23 to 10/18/23

Funding

This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Notice: This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://www.energy.gov/doe-public-access-plan).

Funder: U.S. Department of Energy
Funder number: DE-AC05-00OR22725
Funder: Office of Science

    Keywords

    • Darshan
    • I/O Prefetching
    • Parallel File System
