TY - GEN
T1 - Scientific User Behavior and Data-Sharing Trends in a Petascale File System
AU - Lim, Seung Hwan
AU - Sim, Hyogi
AU - Gunasekaran, Raghul
AU - Vazhkudai, Sudharshan S.
N1 - Publisher Copyright: © 2017 ACM.
PY - 2017
Y1 - 2017
N2 - The Oak Ridge Leadership Computing Facility (OLCF) runs the No. 4 supercomputer in the world, supported by a petascale file system, to facilitate scientific discovery. In this paper, using the daily file system metadata snapshots collected over 500 days, we have studied the behavioral trends of 1,362 active users and 380 projects across 35 science domains. In particular, we have analyzed both individual and collective behavior of users and projects, highlighting needs from individual communities and the overall requirements to operate the file system. We have analyzed the metadata across three dimensions, namely (i) the projects' file generation and usage trends, using quantitative file system-centric metrics, (ii) scientific user behavior on the file system, and (iii) the data sharing trends of users and projects. To the best of our knowledge, our work is the first of its kind to provide comprehensive insights on user behavior from multiple science domains through metadata analysis of a large-scale shared file system. We envision that this OLCF case study will provide valuable insights for the design, operation, and management of storage systems at scale, and also encourage other HPC centers to undertake similar efforts.
AB - The Oak Ridge Leadership Computing Facility (OLCF) runs the No. 4 supercomputer in the world, supported by a petascale file system, to facilitate scientific discovery. In this paper, using the daily file system metadata snapshots collected over 500 days, we have studied the behavioral trends of 1,362 active users and 380 projects across 35 science domains. In particular, we have analyzed both individual and collective behavior of users and projects, highlighting needs from individual communities and the overall requirements to operate the file system. We have analyzed the metadata across three dimensions, namely (i) the projects' file generation and usage trends, using quantitative file system-centric metrics, (ii) scientific user behavior on the file system, and (iii) the data sharing trends of users and projects. To the best of our knowledge, our work is the first of its kind to provide comprehensive insights on user behavior from multiple science domains through metadata analysis of a large-scale shared file system. We envision that this OLCF case study will provide valuable insights for the design, operation, and management of storage systems at scale, and also encourage other HPC centers to undertake similar efforts.
KW - Distributed file systems
KW - Usage measurement
UR - http://www.scopus.com/inward/record.url?scp=85142280133&partnerID=8YFLogxK
U2 - 10.1145/3126908.3126924
DO - 10.1145/3126908.3126924
M3 - Conference contribution
AN - SCOPUS:85142280133
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - SC 2017 - International Conference for High Performance Computing, Networking, Storage and Analysis
PB - IEEE Computer Society
T2 - 2017 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017
Y2 - 12 November 2017 through 17 November 2017
ER -