Abstract
This paper introduces the Yet Another Kernel Launcher (YAKL) C++ portability library, which strives to enable user-level code with the look and feel of Fortran code. The intended audience includes both C++ developers and Fortran developers unfamiliar with C++. The C++ portability approach is briefly explained, YAKL’s main features are described, and code examples are given that demonstrate YAKL’s usage. YAKL fills a niche that is particularly important to scientific applications seeking to port Fortran code quickly to a portable C++ library. YAKL places heavy emphasis on simplicity, readability, and productivity, with performance efforts focused mainly on Graphics Processing Units (GPUs). Central to YAKL’s ability to allow Fortran-like user-level code are three features: (1) a multi-dimensional Array class that allows Fortran behavior; (2) a limited library of Fortran intrinsic functions; and (3) an efficient pool allocator that transparently enables cheap, frequent allocations and deallocations of YAKL Arrays. While YAKL allows Fortran-style code, it also allows Arrays that exhibit C-like behavior, including row-major index ordering and lower bounds of “0”. YAKL currently supports CPUs, CPU threading, and Nvidia, AMD, and Intel GPUs.
Original language | English |
---|---|
Pages (from-to) | 209-230 |
Number of pages | 22 |
Journal | International Journal of Parallel Programming |
Volume | 51 |
Issue number | 4-5 |
DOIs | |
State | Published - Oct 2023 |
Funding
This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. The authors wish to thank Daniel Arndt from Oak Ridge National Laboratory, who helped inform best practices for using SYCL.
Keywords
- C++
- GPU
- HPC
- Portability