TY - GEN
T1 - Julia as a unifying end-to-end workflow language on the Frontier exascale system
AU - Godoy, William F.
AU - Valero-Lara, Pedro
AU - Anderson, Caira
AU - Lee, Katrina W.
AU - Gainaru, Ana
AU - Ferreira Da Silva, Rafael
AU - Vetter, Jeffrey S.
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/11/12
Y1 - 2023/11/12
N2 - We evaluate Julia as a single language and ecosystem paradigm powered by LLVM to develop workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy's first exascale supercomputer. We evaluate the performance, scaling, and trade-offs of (i) the computational kernel on AMD's MI250x GPUs, (ii) weak scaling up to 4,096 MPI processes/GPUs or 512 nodes, (iii) parallel I/O writes using the ADIOS2 library bindings, and (iv) Jupyter Notebooks for interactive analysis. Results suggest that although Julia generates a reasonable LLVM-IR, a nearly 50% performance difference exists vs. native AMD HIP stencil codes when running on the GPUs. As expected, we observed near-zero overhead when using MPI and parallel I/O bindings for system-wide installed implementations. Consequently, Julia emerges as a compelling high-performance and high-productivity workflow composition language, as measured on the fastest supercomputer in the world.
AB - We evaluate Julia as a single language and ecosystem paradigm powered by LLVM to develop workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy's first exascale supercomputer. We evaluate the performance, scaling, and trade-offs of (i) the computational kernel on AMD's MI250x GPUs, (ii) weak scaling up to 4,096 MPI processes/GPUs or 512 nodes, (iii) parallel I/O writes using the ADIOS2 library bindings, and (iv) Jupyter Notebooks for interactive analysis. Results suggest that although Julia generates a reasonable LLVM-IR, a nearly 50% performance difference exists vs. native AMD HIP stencil codes when running on the GPUs. As expected, we observed near-zero overhead when using MPI and parallel I/O bindings for system-wide installed implementations. Consequently, Julia emerges as a compelling high-performance and high-productivity workflow composition language, as measured on the fastest supercomputer in the world.
KW - Frontier supercomputer
KW - HPC
KW - High-Performance Computing
KW - Julia
KW - Jupyter notebooks
KW - data analysis
KW - end-to-end workflows
KW - exascale
UR - http://www.scopus.com/inward/record.url?scp=85178131216&partnerID=8YFLogxK
U2 - 10.1145/3624062.3624278
DO - 10.1145/3624062.3624278
M3 - Conference contribution
AN - SCOPUS:85178131216
T3 - ACM International Conference Proceeding Series
SP - 1989
EP - 1999
BT - Proceedings of 2023 SC Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
PB - Association for Computing Machinery
T2 - 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
Y2 - 12 November 2023 through 17 November 2023
ER -