TY - GEN
T1 - Performance Analysis of a Quantum Monte Carlo Application on Multiple Hardware Architectures Using the HPX Runtime
AU - Wei, Weile
AU - Chatterjee, Arghya
AU - Huck, Kevin
AU - Hernandez, Oscar
AU - Kaiser, Hartmut
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/11
Y1 - 2020/11
AB - This paper describes how we successfully used the HPX programming model to port the DCA++ application to multiple architectures, including POWER9, x86, ARM v8, and NVIDIA GPUs. We describe the lessons learned from this experience and the benefits of enabling HPX in the application to improve the CPU threading part of the code, which led to an overall 21% improvement across architectures. We also describe how we used HPX-APEX to raise the level of abstraction to understand performance issues and identify tasking optimization opportunities in the code, and how these relate to CPU/GPU utilization counters, device memory allocation over time, and CPU kernel-level context switches on a given architecture.
KW - Autonomic Performance Environment for eXascale (APEX)
KW - Dynamical Cluster Approximation (DCA)
KW - HPX runtime system
KW - Quantum Monte Carlo (QMC)
UR - http://www.scopus.com/inward/record.url?scp=85101228392&partnerID=8YFLogxK
U2 - 10.1109/ScalA51936.2020.00015
DO - 10.1109/ScalA51936.2020.00015
M3 - Conference contribution
AN - SCOPUS:85101228392
T3 - Proceedings of ScalA 2020: 11th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 77
EP - 84
BT - Proceedings of ScalA 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 11th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2020
Y2 - 13 November 2020 through 13 November 2020
ER -