TY - GEN
T1 - Integer Sum Reduction with OpenMP on an AMD MI100 GPU
AU - Jin, Zheming
AU - Vetter, Jeffrey S.
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Sum reduction is a primitive operation in parallel computing. Device offload support allows a user to use OpenMP directives to take advantage of a highly capable GPU. In this paper, we present the integer sum reduction annotated with the OpenMP directives and evaluate the performance impacts of tunable parameters with the AOMP and GCC compilers on an AMD MI100 GPU. In addition, we explain the implementations of the OpenMP reduction by the compilers. Sweeping over the pruned parameter space, we find that the speedup is approximately 20 with AOMP, and the reduction performance using AOMP is approximately 11% higher than that using GCC. However, the OpenMP offload performance is approximately 30% lower compared to the performance of the reductions written with rocThrust or hipCUB.
KW - AMD GPU
KW - OpenMP target offload
KW - Reduction
UR - http://www.scopus.com/inward/record.url?scp=85136160970&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW55747.2022.00088
DO - 10.1109/IPDPSW55747.2022.00088
M3 - Conference contribution
AN - SCOPUS:85136160970
T3 - Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022
SP - 496
EP - 499
BT - Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 36th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022
Y2 - 30 May 2022 through 3 June 2022
ER -