Abstract
In high-performance computing (HPC), scientific applications often manage a massive amount of data using I/O libraries. These libraries provide convenient data model abstractions, help ensure data portability, and, most important, empower end users to improve I/O performance by tuning configurations across multiple layers of the HPC I/O stack. We propose SCTuner, an autotuner integrated within the I/O library itself to dynamically tune both the I/O library and the underlying I/O stack at application runtime. To this end, we introduce a statistical benchmarking method to profile the behaviors of individual supercomputer I/O subsystems with varied configurations across I/O layers. We use the benchmarking results as the built-in knowledge in SCTuner, implement an I/O pattern extractor, and plan to implement an online performance tuner as the SCTuner runtime. We conducted a benchmarking analysis on the Summit supercomputer and its GPFS file system Alpine. The preliminary results show that our method can effectively extract the consistent I/O behaviors of the target system under production load, building the base for I/O autotuning at application runtime.
Original language | English |
---|---|
Title of host publication | Proceedings of PDSW 2021 |
Subtitle of host publication | IEEE/ACM 6th International Parallel Data Systems Workshop, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 29-34 |
Number of pages | 6 |
ISBN (Electronic) | 9781665418379 |
DOIs | |
State | Published - 2021 |
Externally published | Yes |
Event | 6th IEEE/ACM International Parallel Data Systems Workshop, PDSW 2021 - St. Louis, United States Duration: Nov 15 2021 → … |
Publication series
Name | Proceedings of PDSW 2021: IEEE/ACM 6th International Parallel Data Systems Workshop, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|
Conference
Conference | 6th IEEE/ACM International Parallel Data Systems Workshop, PDSW 2021 |
---|---|
Country/Territory | United States |
City | St. Louis |
Period | 11/15/21 → … |
Funding
ACKNOWLEDGMENT This research is supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This work was supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357.This work used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525 (SAND2021-12186 C). Sudarsun Kannan was partially supported by NSF CNS 1850297 award. This material is based upon work supported by the U.S. Department of Energy , Office of Science, under contract DE-AC02-06CH11357.