Abstract
We define the fundamental practices and criteria for evaluating and using the Meta Llama 3 and OpenAI ChatGPT 3.5 and 4o large language models (LLMs) to translate parallel scientific Fortran + OpenMP and Fortran + OpenACC codes into C/C++ codes that can leverage vendor-specific libraries (CUDA, HIP) for GPU acceleration in addition to other performance-portable programming models (e.g., Kokkos, OpenMP, OpenACC). In this study, LLMs are used to translate 11 different parallel Fortran codes that implement some of the most popular and widely used kernels/proxies in high-performance computing (HPC): AXPY, GEMV, GEMM, Jacobi, SpMV, and the >200-line Hartree-Fock application proxy, which implements a solver for quantum many-body systems. In all, we analyze the correctness and reproducibility of more than 1,650 AI-generated parallel C/C++ codes. Additionally, we evaluate the performance of the Fortran codes and the AI-generated C/C++ codes on two modern HPC architectures: an AMD EPYC Rome CPU with 64 cores and an NVIDIA Ampere A100 GPU. We use multi-modal prompting and fine-tuning techniques for LLMs to produce parallel scientific C/C++ codes with high levels of correctness (more than 95% of the codes are well ported) and speedups of up to an order of magnitude versus the Fortran + OpenMP and Fortran + OpenACC codes on the same system.
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2025 IEEE International Conference on e-Science, eScience 2025 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 385-394 |
| Number of pages | 10 |
| ISBN (Electronic) | 9798331591458 |
| DOIs | |
| State | Published - 2025 |
| Event | 21st IEEE International Conference on e-Science, eScience 2025 - Chicago, United States Duration: Sep 15 2025 → Sep 18 2025 |
Publication series
| Name | Proceedings - 2025 IEEE International Conference on e-Science, eScience 2025 |
|---|
Conference
| Conference | 21st IEEE International Conference on e-Science, eScience 2025 |
|---|---|
| Country/Territory | United States |
| City | Chicago |
| Period | 09/15/25 → 09/18/25 |
Funding
This research used resources from the Experimental Computing Laboratory at Oak Ridge National Laboratory, which is supported by the US Department of Energy's (DOE's) Office of Science under contract DE-AC05-00OR22725. This material is based upon work supported by the US Department of Energy, Office of Science's Advanced Scientific Computing Research program as part of the Advancements in Artificial Intelligence for Science program, Ellora and Durban projects, the Next Generation of Scientific Software Technologies program, S4PST and PESO projects, and the Scientific Discovery through Advanced Computing (SciDAC) program, RAPIDS2 SciDAC Institute for Computer Science, Data, and Artificial Intelligence. This research was also supported in part by an appointment to DOE's Omni Technology Alliance Internship Program, sponsored by the DOE and administered by the Oak Ridge Institute for Science and Education. This manuscript has been authored by UT-Battelle LLC under contract DE-AC05-00OR22725 with the DOE. The publisher acknowledges the US government license to provide public access under the DOE Public Access Plan (https://energy.gov/downloads/doe-public-access-plan).
Keywords
- AI
- C/C++
- CUDA
- Fortran
- HIP
- Kokkos
- Large Language Models
- OpenACC
- OpenMP
- Parallel Programming