Abstract
Heterogeneous systems are becoming increasingly prevalent. To exploit the rich compute resources of such systems, application developers need robust programming models that let them migrate legacy code seamlessly from today’s systems to tomorrow’s. For more than a decade, directives have been established as one of the promising paths for tackling programming challenges on emerging systems. This work applies and demonstrates OpenMP offloading directives on five proxy applications. We observe that performance varies widely from one compiler to another; a crucial aspect of our work is reporting best practices to application developers who use OpenMP offloading compilers. While developers can work around some issues, others must be reported to the compiler vendors. By restructuring OpenMP offloading directives, we gain an 18x speedup for the su3 proxy application on NERSC’s Cori system when using the Clang compiler. By switching max reductions to add reductions in the laplace mini-app, we gain a 15.7x speedup when using the Cray-llvm compiler on Cori.
Original language | English |
---|---|
Title of host publication | Accelerator Programming Using Directives - 7th International Workshop, WACCPD 2020, Proceedings |
Editors | Sridutt Bhalachandra, Sandra Wienke, Sunita Chandrasekaran, Guido Juckeland |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 25-44 |
Number of pages | 20 |
ISBN (Print) | 9783030742232 |
State | Published - 2021 |
Event | 7th International Workshop on Accelerator Programming using Directives, WACCPD 2020 - Virtual, Online. Duration: Nov 20 2020 → Nov 20 2020 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12655 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 7th International Workshop on Accelerator Programming using Directives, WACCPD 2020 |
---|---|
City | Virtual, Online |
Period | 11/20/20 → 11/20/20 |
Funding
Acknowledgements. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231. This research also used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. The authors would like to thank Doug Doerfler and Rahul Gayatri for helpful discussion about the su3 benchmark and useful research directions for this project.
Keywords
- Directive-based programming
- GPU
- Heterogeneous systems
- NVIDIA
- OpenMP
- Performance portability
- V100