LLM-Driven Fortran-to-C/C++ Portability for Parallel Scientific Codes

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

We define the fundamental practices and criteria for evaluating and using the Meta Llama 3 and OpenAI ChatGPT 3.5 and 4o large language models (LLMs) to translate parallel scientific Fortran + OpenMP and Fortran + OpenACC codes to C/C++ codes that can leverage vendor-specific libraries (CUDA, HIP) for GPU acceleration in addition to other performance-portable programming models (e.g., Kokkos, OpenMP, OpenACC). In this study, LLMs are used to translate 11 different parallel Fortran codes with some of the most popular and widely used kernels/proxies in high-performance computing (HPC): AXPY, GEMV, GEMM, Jacobi, SpMV, and the >200-line Hartree-Fock application proxy, which implements a solver for quantum many-body systems. In all, we analyze the correctness and reproducibility of more than 1,650 AI-generated parallel C/C++ codes. Additionally, we evaluate the performance of Fortran codes and AI-generated C/C++ codes on two modern HPC architectures - one AMD EPYC Rome CPU with 64 cores and one NVIDIA Ampere A100 GPU. We use multi-modal prompting and fine-tuning techniques for LLMs to produce parallel scientific C/C++ codes with high levels of correctness (more than 95% of the codes are well ported) and speedups of up to an order of magnitude versus Fortran + OpenMP and Fortran + OpenACC codes on the same system.
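
For orientation, the simplest kernel named in the abstract, AXPY (y = a·x + y), might look as follows once ported to C++ with OpenMP. This is a minimal sketch and not code from the study: the function name `axpy`, its signature, and the use of `std::vector` are assumptions made for illustration. In Fortran + OpenMP the same loop would typically sit under a `!$omp parallel do` directive.

```cpp
#include <cassert>
#include <vector>

// AXPY: y = a * x + y, with the loop iterations shared among OpenMP threads.
// Hypothetical name and signature, chosen only for this illustration.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());  // both vectors must have the same length
    const long n = static_cast<long>(x.size());
    // Each iteration touches a distinct element, so no synchronization is needed.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Compiled without OpenMP support, the pragma is ignored and the loop runs serially with the same result, which is convenient when checking a translated kernel against its Fortran reference output.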

Original language: English
Title of host publication: Proceedings - 2025 IEEE International Conference on e-Science, eScience 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 385-394
Number of pages: 10
ISBN (Electronic): 9798331591458
State: Published - 2025
Event: 21st IEEE International Conference on e-Science, eScience 2025 - Chicago, United States
Duration: Sep 15 2025 to Sep 18 2025

Publication series

Name: Proceedings - 2025 IEEE International Conference on e-Science, eScience 2025

Conference

Conference: 21st IEEE International Conference on e-Science, eScience 2025
Country/Territory: United States
City: Chicago
Period: 09/15/25 to 09/18/25

Funding

This research used resources from the Experimental Computing Laboratory at Oak Ridge National Laboratory, which is supported by the US Department of Energy's (DOE's) Office of Science under contract DE-AC05-00OR22725. This material is based upon work supported by the US Department of Energy, Office of Science's Advanced Scientific Computing Research program as part of the Advancements in Artificial Intelligence for Science program, Ellora and Durban projects, the Next Generation of Scientific Software Technologies program, S4PST and PESO projects, and the Scientific Discovery through Advanced Computing (SciDAC) program, RAPIDS2 SciDAC Institute for Computer Science, Data, and Artificial Intelligence. This research was also supported in part by an appointment to DOE's Omni Technology Alliance Internship Program, sponsored by the DOE and administered by the Oak Ridge Institute for Science and Education. This manuscript has been authored by UT-Battelle LLC under contract DE-AC05-00OR22725 with the DOE. The publisher acknowledges the US government license to provide public access under the DOE Public Access Plan (https://energy.gov/downloads/doe-public-access-plan).

Keywords

  • AI
  • C/C++
  • CUDA
  • Fortran
  • HIP
  • Kokkos
  • Large Language Models
  • OpenACC
  • OpenMP
  • Parallel Programming
