Abstract
A growing disparity between supercomputer computation speeds and I/O rates means that it is rapidly becoming infeasible to analyze supercomputer application output only after that output has been written to a file system. Instead, data-generating applications must run concurrently with data reduction and/or analysis operations, with which they exchange information via high-speed methods such as interprocess communication. The resulting parallel computing motif, online data analysis and reduction (ODAR), has important implications for both application and HPC systems design. Here we introduce the ODAR motif and its co-design concerns, describe a co-design process for identifying and addressing those concerns, present tools that assist in the co-design process, and present case studies to illustrate the use of the process and tools in practical settings.
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 617-635 |
Number of pages | 19 |
Journal | International Journal of High Performance Computing Applications |
Volume | 35 |
Issue number | 6 |
State | Published - Nov 2021 |
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This article reports on work supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and National Nuclear Security Administration. This research used resources of the Argonne and Oak Ridge Leadership Computing Facilities and NERSC, DOE Office of Science User Facilities supported under Contracts DE-AC02-06CH11357, DE-AC05-00OR22725, and DE-AC02-05CH11231, respectively.
Keywords
- Data analysis
- exascale computing
- in situ
- online data analysis and reduction