Abstract
Some of the most important categories of performance events count the data traffic between the processing cores and the main memory. However, since these counters are not core-private, applications require elevated privileges to access them. PAPI offers a component that can access this information on IBM systems through the Performance Co-Pilot (PCP); however, doing so adds an indirection layer that involves querying the PCP daemon. This paper performs a quantitative study of the accuracy of the measurements obtained through this component on the Summit supercomputer. We use two linear algebra kernels (a generalized matrix multiply and a modified matrix-vector multiply) as benchmarks, together with a distributed, GPU-accelerated 3D-FFT mini-app (using cuFFT), to compare the measurements obtained through the PAPI PCP component against the expected values across different problem sizes. We also compare our measurements against those from an in-house machine with an architecture very similar to Summit's, where elevated privileges allow PAPI to access the hardware counters directly (without using PCP), to show that measurements taken via PCP are as accurate as those taken directly. Finally, using both QMCPACK and the 3D-FFT, we demonstrate the diverse hardware activities that can be monitored simultaneously via PAPI hardware components.
Original language | English |
---|---|
Title of host publication | 2023 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2023 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 393-402 |
Number of pages | 10 |
ISBN (Electronic) | 9798350311990 |
State | Published - 2023 |
Externally published | Yes |
Event | 2023 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2023 - St. Petersburg, United States. Duration: May 15 2023 → May 19 2023 |
Publication series
Name | 2023 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2023 |
---|
Conference
Conference | 2023 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2023 |
---|---|
Country/Territory | United States |
City | St. Petersburg |
Period | 05/15/23 → 05/19/23 |
Funding
Acknowledgment: We thank the anonymous reviewers for their improvement suggestions. This research was supported in part by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration; and by the National Science Foundation under award No. 1900888 “ANACIN-X.”
Keywords
- GPU power
- PAPI
- high performance computing
- memory bandwidth
- network traffic
- performance analysis
- performance counters