Collection of Disk Failure Events from Alpine, the Parallel File System for Summit Supercomputer

Dataset

Description

This dataset contains hard disk drive (HDD) failure events collected from Alpine, the storage system of the Summit supercomputer, hosted at the Oak Ridge Leadership Computing Facility (OLCF). The events span January 4, 2019, to December 21, 2023 (a total of 4 years, 11 months, and 18 days), covering 89% of the system's operational lifetime. The dataset includes 3,766 disk failure events, each recorded with its detection timestamp (in ISO 8601 format) and its location within the storage system: rack, enclosure, and drive slot number.
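As a rough illustration of how records of this shape might be processed, the sketch below parses a few events and counts failures per rack. The CSV layout, the column names (`timestamp`, `rack`, `enclosure`, `slot`), and the sample rows are all assumptions made up for illustration; they are not taken from the dataset files, whose actual format may differ. The last two lines derive the length of the collection window and the mean daily failure count from the figures stated in the description.

```python
import csv
import io
from collections import Counter
from datetime import date, datetime

# Hypothetical layout: the column names and rows below are placeholders
# for illustration only, not records from the dataset.
SAMPLE = """timestamp,rack,enclosure,slot
2019-01-04T10:15:00,1,2,17
2019-01-05T08:30:00,1,3,42
2019-01-05T23:59:59,2,1,5
"""

def load_events(text):
    """Parse CSV text into a list of dicts, one per failure event."""
    events = []
    for row in csv.DictReader(io.StringIO(text)):
        events.append({
            # ISO 8601 timestamps parse directly with the standard library.
            "timestamp": datetime.fromisoformat(row["timestamp"]),
            # Location identifiers are kept as strings, since their real
            # format is not specified in the description.
            "rack": row["rack"],
            "enclosure": row["enclosure"],
            "slot": row["slot"],
        })
    return events

events = load_events(SAMPLE)
failures_per_rack = Counter(e["rack"] for e in events)

# Collection window and event count as stated in the description.
span_days = (date(2023, 12, 21) - date(2019, 1, 4)).days
mean_failures_per_day = 3766 / span_days
```

Grouping on `enclosure` or `slot` instead of `rack` follows the same pattern; only the key passed to `Counter` changes.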

Funding

DE-AC05-00OR22725

Cite this