Mojo: MLIR-based Performance-Portable HPC Science Kernels on GPUs for the Python Ecosystem

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We explore the performance and portability of the novel Mojo language for scientific computing workloads on GPUs. As the first language built on LLVM's Multi-Level Intermediate Representation (MLIR) compiler infrastructure, Mojo aims to close performance and productivity gaps by combining Python interoperability with a CUDA-like syntax for compile-time portable GPU programming. We target four scientific workloads: a seven-point stencil (memory-bound), BabelStream (memory-bound), miniBUDE (compute-bound), and Hartree-Fock (compute-bound with atomic operations); and compare their performance against vendor baselines on NVIDIA H100 and AMD MI300A GPUs. We show that Mojo's performance is competitive with CUDA and HIP for memory-bound kernels, whereas gaps remain on AMD GPUs for atomic operations and, on both AMD and NVIDIA GPUs, for fast-math compute-bound kernels. Although the programming model is still fairly low-level and carries a learning curve, Mojo can close significant gaps in the fragmented Python ecosystem at the convergence of scientific computing and AI.
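The seven-point stencil benchmarked in the abstract is a standard memory-bound pattern: each grid point is updated from itself and its six face neighbors in 3-D. As a minimal NumPy reference of that pattern (illustrative coefficients, not the paper's Mojo kernel):

```python
import numpy as np

def stencil7(u, c0=0.5, c1=0.125):
    """Apply one seven-point stencil sweep over the interior of a
    3-D grid: out = c0*center + c1*(sum of six face neighbors).
    Boundary points are left unchanged. Coefficients are illustrative."""
    out = u.copy()
    out[1:-1, 1:-1, 1:-1] = (
        c0 * u[1:-1, 1:-1, 1:-1]
        + c1 * (u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1]   # x-1, x+1
                + u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1]  # y-1, y+1
                + u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:]) # z-1, z+1
    )
    return out

u = np.ones((8, 8, 8))
v = stencil7(u)  # interior becomes 0.5 + 6*0.125 = 1.25; boundary stays 1.0
```

Each output element reads seven inputs but performs only a handful of flops, which is why this kernel (like BabelStream) is bandwidth-limited on GPUs.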

Original language: English
Title of host publication: Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
Publisher: Association for Computing Machinery, Inc
Pages: 2114-2128
Number of pages: 15
ISBN (Electronic): 9798400718717
DOIs
State: Published - Nov 15 2025
Event: 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops - St. Louis, United States
Duration: Nov 16 2025 – Nov 21 2025

Publication series

Name: Proceedings of 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops

Conference

Conference: 2025 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC 2025 Workshops
Country/Territory: United States
City: St. Louis
Period: 11/16/25 – 11/21/25

Funding

This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan https://www.energy.gov/doe-public-access-plan. This work was supported in part by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research’s Computer Science Competitive Portfolios program, MAGMA/Fairbanks project; and the Next Generation of Scientific Software Technologies program, PESO and S4PST projects. This research used resources of the Oak Ridge Leadership Computing Facility and the Experimental Computing Laboratory at the Oak Ridge National Laboratory, which are supported by the Office of Science of the US Department of Energy under Contract No. DE-AC05-00OR22725. This work was supported in part by the U.S. Department of Energy, Office of Science, Office of Workforce Development for Teachers and Scientists (WDTS) under the Science Undergraduate Laboratory Internships Program (SULI). WFG and TM would like to thank Alex Smith from the University of Wisconsin-Madison for providing the CUDA and HIP ports of Hartree-Fock.

Keywords

  • GPU
  • HPC
  • Mojo
  • Python
  • performance portability
  • productivity
  • science kernels

