Filling Performance Portability and High-Productivity Gaps for Scientific Applications with Julia and JACC

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The performance gap in high-productivity languages such as Python, Julia, R, and Matlab is still an open research area in which significant investments are being made. However, the challenge of writing code once that performs efficiently across disparate hardware is made even greater by architectural heterogeneity. Our work focuses on JACC [3], a performance-portable CPU/GPU library for Julia, the LLVM-based scientific programming language. Similar to Kokkos and SYCL in C++, JACC provides a simple high-level metaprogramming API for Julia. JACC leverages CPU and GPU vendor-specific backends, and thus its proposition is to reuse rather than reinvent current investments in the language. In addition, JACC is complementary to existing performance-portable solutions in Julia, such as KernelAbstractions.jl [2], which targets mainly GPUs using a fine-granularity programming model, similar to CUDA and HIP. Due to its novelty, JACC must still be evaluated across multidisciplinary science domains and heterogeneous hardware architectures.

Julia is no different from other programming languages in facing performance-portability challenges. Currently, Julia's programming models tend to closely follow vendor layers, which could still be too low level, thereby hindering programming productivity. JACC addresses this challenge for Julia programmers and applications, providing a high-level API for CPU and GPU hardware, which could potentially be extended to other architectures (e.g., AI custom hardware, field-programmable gate arrays) and configurations (e.g., distributed memory, multidevice use).

Our poster presents a comprehensive view of JACC for productive scientific computing. First, we describe the JACC model, which is divided into two main components: (i) portable memory and (ii) kernel launching. We then describe the JACC architecture, which so far leverages four backends built on top of Julia's Base Threads, CUDA, AMDGPU, and OneAPI to target multi-core CPUs and NVIDIA, AMD, and Intel GPUs, respectively.

The performance of JACC is evaluated on widely used science application workloads: Hartree-Fock, XSBench, miniBUDE, BabelStream, and LULESH on recent CPUs and NVIDIA, AMD, and Intel GPUs where possible. Our results show that JACC is competitive and can achieve performance similar to that of OpenMP, Kokkos, OpenCL, SYCL, and CUDA/HIP baseline implementations in C++, C, and Fortran for several workloads, though gaps still need to be understood for specific corner cases. We also provide a roadmap for JACC's support of multiple GPUs, shared memory, and future backends (e.g., Apple's Metal for GPUs). Together, JACC and Julia form a strong paradigm for HPC [1], enabling the development of performance-portable codes at a fraction of the cost.

Original language: English
Title of host publication: 54th International Conference on Parallel Processing, ICPP 2025 - Workshops Proceedings
Publisher: Association for Computing Machinery, Inc
Pages: 191-192
Number of pages: 2
ISBN (Electronic): 9798400721090
DOIs
State: Published - Dec 20 2025
Event: 54th International Conference on Parallel Processing Workshop, ICPP 2025 - San Diego, United States
Duration: Sep 8 2025 → Sep 11 2025

Publication series

Name: 54th International Conference on Parallel Processing, ICPP 2025 - Workshops Proceedings

Conference

Conference: 54th International Conference on Parallel Processing Workshop, ICPP 2025
Country/Territory: United States
City: San Diego
Period: 09/8/25 → 09/11/25

Funding

This work was supported in part by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research's Computer Science Competitive Portfolios (MAGMA/Fairbanks project) and Next Generation of Scientific Software Technologies (S4PST and PESO projects) programs. This research used resources of the Oak Ridge Leadership Computing Facility and the Experimental Computing Laboratory at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Keywords

  • HPC
  • JACC
  • Julia
  • high-productivity languages
  • performance portability
  • scientific computing
