Abstract
Scientific domains ranging from bioinformatics to astronomy and earth science rely on traditional high-performance computing (HPC) codes, often encapsulated in scientific workflows. In contrast to traditional HPC codes, which employ a few programming and runtime approaches that are highly optimized for HPC platforms, scientific workflows are not necessarily optimized for these platforms. In an effort to reduce the gap between compute and I/O performance, HPC platforms have adopted intermediate storage layers known as burst buffers. A burst buffer (BB) is a fast storage layer positioned between the global parallel file system and the compute nodes. Two designs currently exist: (i) shared, where the BBs are located on dedicated nodes; and (ii) on-node, in which each compute node embeds a private BB. In this paper, using accurate simulations and real-world experiments, we study how to best use these new storage layers when executing scientific workflows. Because these applications are not necessarily optimized to run on HPC systems, they can exhibit I/O patterns that differ from those of HPC codes. We therefore first characterize the I/O behavior of a real-world workflow under different configuration scenarios on two leadership-class HPC systems (Cori at NERSC and Summit at ORNL). Then, we use these characterizations to calibrate a simulator for workflow executions on HPC systems featuring shared and private BBs. Last, we evaluate our approach against a large I/O-intensive workflow and provide insights on the performance levels and potential limitations of these two BB architectures.
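The contrast between the two BB designs can be illustrated with a first-order, back-of-the-envelope model (a hypothetical sketch with illustrative numbers, not the paper's calibrated simulator): a shared BB pools bandwidth on dedicated nodes but is contended by all compute nodes, while an on-node BB gives each compute node private, uncontended bandwidth.

```python
# Hypothetical first-order model of a workflow I/O phase under the two
# burst-buffer (BB) designs. All parameter values are illustrative
# assumptions, not the calibrated values from the paper.

def io_time_shared(data_per_node_gb, n_nodes, pool_bw_gbs):
    """Shared BB: all nodes contend for one aggregate bandwidth pool,
    so total data moved divides the pool bandwidth."""
    return n_nodes * data_per_node_gb / pool_bw_gbs

def io_time_on_node(data_per_node_gb, per_node_bw_gbs):
    """On-node BB: each node writes to its private device in parallel,
    so the phase is bounded by a single node's transfer."""
    return data_per_node_gb / per_node_bw_gbs

# Example: 64 nodes each staging 50 GB; assume a 1600 GB/s shared pool
# versus a 2 GB/s node-local device.
t_shared = io_time_shared(50, 64, 1600)  # 64 * 50 / 1600 = 2.0 s
t_private = io_time_on_node(50, 2)       # 50 / 2 = 25.0 s
```

Under these assumed numbers the shared design wins, but the model also shows its weakness: shared-pool time grows linearly with the node count, whereas on-node time is constant per node, which is one reason the paper studies both designs at scale.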
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2020 IEEE International Conference on Cluster Computing, CLUSTER 2020 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 92-103 |
| Number of pages | 12 |
| ISBN (Electronic) | 9781728166773 |
| DOIs | |
| State | Published - Sep 2020 |
| Externally published | Yes |
| Event | 22nd IEEE International Conference on Cluster Computing, CLUSTER 2020 - Kobe, Japan |
| Duration | Sep 14 2020 → Sep 17 2020 |
Publication series
| Name | Proceedings - IEEE International Conference on Cluster Computing, ICCC |
|---|---|
| Volume | 2020-September |
| ISSN (Print) | 1552-5244 |
Conference
| Conference | 22nd IEEE International Conference on Cluster Computing, CLUSTER 2020 |
|---|---|
| Country/Territory | Japan |
| City | Kobe |
| Period | 09/14/20 → 09/17/20 |
Funding
This work is funded by DOE contract number #DE-SC0012636; and partly funded by NSF contracts #1664162, #1741040, #1923539, and #1923621. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. We would also like to thank C. Daley and L. Ramakrishnan from NERSC for their help with the SWarp workflow.
Keywords
- Burst buffers
- High-Performance Computing
- Performance Characterization
- Scientific workflow
- Simulations
Cite this
Modeling the Performance of Scientific Workflow Executions on HPC Platforms with Burst Buffers