Abstract
This paper describes how the OpenACC data model is implemented in current OpenACC compilers, ranging from research compilers (OpenUH and OpenARC) to a commercial compiler (the PGI OpenACC compiler). First, we summarize the various memory architectures in today's accelerator systems. We then describe details and issues in implementing the OpenACC data model in the three OpenACC compilers. These include managing present tables, asynchronous data transfers, asynchronous memory allocation and deallocation, the host_data construct, aliasing on a data directive, reusing device memory, partially present data, and adjacent data. We also discuss ongoing work to manage large, complex dynamic data structures. Finally, we measure present table lookups, device memory allocation, pinned memory allocation, and managed memory in the three OpenACC compilers using eight OpenACC applications (seven from the SPEC ACCEL benchmark suite and a shock-hydrodynamics mini-application called LULESH).
Original language | English |
---|---|
Pages (from-to) | 15-27 |
Number of pages | 13 |
Journal | Parallel Computing |
Volume | 78 |
DOIs | |
State | Published - Oct 2018 |
Funding
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research. This research was supported in part by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.
Funders | Funder number |
---|---|
U.S. Department of Energy Office of Science | |
U.S. Department of Energy | |
Office of Science | |
National Nuclear Security Administration | |
Advanced Scientific Computing Research | 17-SC-20-SC |
Keywords
- Accelerators
- Compiler implementations
- Data model
- OpenACC