Battle of the defaults: Extracting performance characteristics of HDF5 under production load

Bing Xie, Houjun Tang, Suren Byna, Jesse Hanley, Quincey Koziol, Tonglin Li, Sarp Oral

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

5 Scopus citations

Abstract

Popular parallel I/O libraries, such as HDF5, provide tuning parameters to obtain superior performance. However, selecting effective parameters on production systems is complex because the I/O software and file system layers are interdependent. Hence, application developers typically use the default parameters and often experience poor I/O performance. This work conducts a benchmarking-based analysis of HDF5 behavior across a wide variety of I/O patterns to extract performance characteristics under production workload. To keep the analysis well controlled, we run I/O benchmarks on POSIX-IO, MPI-IO, and HDF5 using the same I/O patterns and in the same jobs. To address the high performance variability of production environments, we repeat the benchmarks across I/O patterns, storage devices, and time intervals. From the results, we identify consistent HDF5 behaviors showing that appropriate configurations and operations on dataset layout and file-metadata placement can improve performance significantly. We apply our findings and evaluate the tuned I/O library on two supercomputers: Summit and Cori. The results show that our tuned parameters achieve more than a 10× I/O speedup over the default parameters on both systems, suggesting the effectiveness, stability, and generality of our solution.

Original language: English
Title of host publication: Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
Editors: Laurent Lefevre, Stacy Patterson, Young Choon Lee, Haiying Shen, Shashikant Ilager, Mohammad Goudarzi, Adel N. Toosi, Rajkumar Buyya
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 51-60
Number of pages: 10
ISBN (Electronic): 9781728195865
DOIs
State: Published - May 2021
Event: 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 - Virtual, Melbourne, Australia
Duration: May 10, 2021 to May 13, 2021

Publication series

Name: Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021

Conference

Conference: 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
Country/Territory: Australia
City: Virtual, Melbourne
Period: 05/10/21 to 05/13/21

Funding

This research is supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This work used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725, and resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

Funders: Funder number
U.S. Department of Energy: DE-AC02-05CH11231
Office of Science: DE-AC05-00OR22725
Advanced Scientific Computing Research
National Energy Research Scientific Computing Center
