JACC.shared: Leveraging HPC Metaprogramming and Performance Portability for Computations That Use Shared Memory GPUs

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this work, we present JACC.shared, a new feature of Julia for ACCelerators (JACC), which is the performance-portable and metaprogramming model of the just-in-time and LLVM-based Julia language. This new feature allows JACC applications to leverage the high-performance computing (HPC) capabilities of high-bandwidth, on-chip GPU memory. Historically, exploiting high-bandwidth, shared-memory GPUs has not been a priority for high-level programming solutions. JACC.shared covers that gap for the first time, thereby providing a high-level, portable, and easy-to-use solution for programmers to exploit this memory and supporting all current major accelerator architectures. Well-known HPC and AI workloads, such as multi/hyperspectral imaging and AI convolutions, have been used to evaluate JACC.shared on two exascale GPU architectures hosted by some of the most powerful US Department of Energy supercomputers: Perlmutter (NVIDIA A100) and Frontier (AMD MI250X). The performance evaluation reports speedups of up to 3.5× obtained by adding only one line of code to the base codes, thus providing important accelerations in a simple, portable, and transparent way and elevating the programming productivity and performance-portability capabilities for Julia/JACC HPC, AI, and scientific applications.
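To make the "one line of code" claim concrete, the following is a minimal, hypothetical Julia sketch of how a JACC kernel might stage a small, reused array into on-chip shared memory. The kernel name, array sizes, and the exact JACC.shared signature are illustrative assumptions based on the abstract, not the paper's code; JACC.parallel_for and JACC.Array follow the publicly documented JACC.jl usage pattern.

    # Minimal sketch (assumptions noted above), not the paper's implementation.
    # Assumed: JACC.shared(w) stages the array `w` into on-chip GPU shared memory
    # inside the kernel body; everything else is the standard backend-agnostic
    # JACC.jl parallel_for pattern.
    import JACC

    # 1D convolution-style kernel: every thread i repeatedly reads the small weight array.
    function conv_kernel(i, out, x, w)
        sw = JACC.shared(w)          # the single added line: copy weights to shared memory
        acc = zero(eltype(out))
        for k in 1:length(sw)
            @inbounds acc += sw[k] * x[i + k - 1]
        end
        @inbounds out[i] = acc
    end

    n, r = 1_000_000, 9
    x   = JACC.Array(ones(Float64, n + r - 1))  # JACC.Array maps to the active backend's array type
    w   = JACC.Array(collect(Float64, 1:r))
    out = JACC.Array(zeros(Float64, n))

    # The launch is identical with or without the JACC.shared line,
    # which is what keeps the change transparent and portable.
    JACC.parallel_for(n, conv_kernel, out, x, w)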

Original language: English
Title of host publication: 2024 IEEE High Performance Extreme Computing Conference, HPEC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350387131
DOIs
State: Published - 2024
Event: 2024 IEEE High Performance Extreme Computing Conference, HPEC 2024 - Virtual, Online
Duration: Sep 23, 2024 - Sep 27, 2024

Publication series

Name: 2024 IEEE High Performance Extreme Computing Conference, HPEC 2024

Conference

Conference: 2024 IEEE High Performance Extreme Computing Conference, HPEC 2024
City: Virtual, Online
Period: 09/23/24 - 09/27/24

Funding

This research used resources of the Oak Ridge Leadership Computing Facility and the Experimental Computing Laboratory (ExCL) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the US Department of Energy under Contract No. DE-AC05-00OR22725. This research was funded in part by the DOE ASCR Stewardship for Programming Systems and Tools (S4PST) project, and by Bluestone, an X-Stack project in the DOE ASCR Office. Notice: This manuscript has been authored by UT-Battelle LLC under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains, and the publisher, by accepting the article for publication, acknowledges that the US government retains, a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://www.energy.gov/doe-publicaccess-plan).

Keywords

  • high-bandwidth on-chip memory
  • JACC
  • Julia
  • metaprogramming
  • performance portability
