Abstract
Popular parallel I/O libraries, such as HDF5, provide tuning parameters to obtain superior performance. However, selecting effective parameters on production systems is complex because of the interdependence of the I/O software and file system layers. Hence, application developers typically use the default parameters and often experience poor I/O performance. This work conducts a benchmarking-based analysis of HDF5 behavior across a wide variety of I/O patterns to extract performance characteristics under production workloads. To keep the analysis well controlled, we run I/O benchmarks on POSIX-IO, MPI-IO, and HDF5 using the same I/O patterns and in the same jobs. To address the high performance variability of production environments, we repeat the benchmarks across I/O patterns, storage devices, and time intervals. From the results, we identify consistent HDF5 behaviors showing that appropriate configurations and operations on dataset layout and file-metadata placement can improve performance significantly. We apply our findings and evaluate the tuned I/O library on two supercomputers, Summit and Cori. The results show that our tuned parameters achieve more than a 10× I/O speedup over the default parameters on both systems, suggesting the effectiveness, stability, and generality of our solution.
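The idea of repeating a benchmark across I/O patterns to cope with run-to-run variability can be illustrated with a minimal sketch. This is not the authors' benchmark: it uses only POSIX-style writes on a local temporary file, and the names `timed_write` and `repeat_benchmark`, the pattern labels, and the sizes are all hypothetical choices made for illustration.

```python
import os
import statistics
import tempfile
import time


def timed_write(path, total_bytes, request_bytes):
    """Write total_bytes to path in request_bytes-sized writes; return MiB/s."""
    buf = b"\0" * request_bytes
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < total_bytes:
            chunk = min(request_bytes, total_bytes - written)
            # os.write may write fewer bytes than asked; the loop handles that.
            written += os.write(fd, buf[:chunk])
        os.fsync(fd)  # include the flush to storage in the timing
    finally:
        os.close(fd)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (total_bytes / (1024 * 1024)) / elapsed


def repeat_benchmark(patterns, trials=5, total_bytes=1 << 20):
    """Run every pattern `trials` times and summarize the bandwidth samples.

    `patterns` maps a hypothetical pattern name to its request size in bytes.
    """
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bench.dat")
        for name, request_bytes in patterns.items():
            samples = [
                timed_write(path, total_bytes, request_bytes)
                for _ in range(trials)
            ]
            results[name] = {
                "median": statistics.median(samples),
                "min": min(samples),
                "max": max(samples),
            }
    return results


if __name__ == "__main__":
    summary = repeat_benchmark({"small_requests": 4096, "large_requests": 262144})
    for name, stats in summary.items():
        print(f"{name}: median {stats['median']:.1f} MiB/s "
              f"(min {stats['min']:.1f}, max {stats['max']:.1f})")
```

Reporting the median alongside the minimum and maximum is one simple way to compare patterns when individual runs are noisy; a study on a shared production file system would also need to vary storage targets and time of day, as the abstract describes.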
Original language | English |
---|---|
Title of host publication | Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 |
Editors | Laurent Lefevre, Stacy Patterson, Young Choon Lee, Haiying Shen, Shashikant Ilager, Mohammad Goudarzi, Adel N. Toosi, Rajkumar Buyya |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 51-60 |
Number of pages | 10 |
ISBN (Electronic) | 9781728195865 |
State | Published - May 2021 |
Event | 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 - Virtual, Melbourne, Australia; Duration: May 10 2021 → May 13 2021 |
Publication series
Name | Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 |
---|
Conference
Conference | 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 |
---|---|
Country/Territory | Australia |
City | Virtual, Melbourne |
Period | 05/10/21 → 05/13/21 |
Funding
This research is supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This work used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725, and resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.