Fresco: A Public Multi-Institutional Dataset for Understanding HPC System Behavior and Dependability

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

The scarcity of publicly available operational data from High Performance Computing (HPC) systems hinders research in critical areas like system dependability and resource optimization. While recent efforts, such as the Atlas project, have increased the availability of cluster traces, these datasets often lack fine-grained operational metrics linked to comprehensive job-level attributes across diverse environments. To address this gap, we introduce Fresco, a dataset containing data from 20.9 million jobs spanning 75 months collected from three major academic supercomputing clusters: Purdue’s Anvil and Conte systems, and Texas Advanced Computing Center’s Stampede. Fresco uniquely captures six key performance metrics alongside many job-level attributes such as resource allocations and execution outcomes. We detail our data integration process that transforms and standardizes the heterogeneous data sources into a consistent format. The resulting dataset enables researchers to investigate the relationships between job characteristics, resource consumption patterns, and system performance in academic HPC environments. We make this resource open source at https://www.frescodata.xyz. Our expectation is that this public release will facilitate research and operational improvements that had previously been impossible due to the unavailability of such data.

Original language: English
Title of host publication: PEARC 2025 - Practice and Experience in Advanced Research Computing 2025
Subtitle of host publication: The Power of Collaboration
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400713989
State: Published - Jul 18 2025
Externally published: Yes
Event: 2025 Practice and Experience in Advanced Research Computing, PEARC 2025 - Columbus, United States
Duration: Jul 20 2025 to Jul 24 2025

Publication series

Name: PEARC 2025 - Practice and Experience in Advanced Research Computing 2025: The Power of Collaboration

Conference

Conference: 2025 Practice and Experience in Advanced Research Computing, PEARC 2025
Country/Territory: United States
City: Columbus
Period: 07/20/25 to 07/24/25

Funding

This work was ably supported by Carol Song of Purdue’s Information Technology Department and Stephen Harrell of the Texas Advanced Computing Center (TACC). This material is based in part upon work supported by the National Science Foundation under Grant Numbers CNS-2016704 and CCF-2140139. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsor.

Keywords

  • Computer system dependability
  • Computer system usage
  • Data repository
