An OpenMP GPU-offload implementation of a non-equilibrium solidification cellular automata model for additive manufacturing

Adrian S. Sabau, Lang Yuan, Jean Luc Fattebert, John A. Turner

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

In this paper, performance strategies on GPU-based HPC platforms are investigated, using OpenMP 4.5, for a cellular automata (CA) simulation code that models non-equilibrium solidification in the metal additive manufacturing (AM) process, including nucleation, grain growth, and solute partitioning and transport. To report the speed-up for multicore CPUs and GPUs accurately, a rigorous performance analysis employed optimizations appropriate for both the CPU-only code (baseline) and the GPU-offload code on an isothermal test problem. The performance results on Summit at the Oak Ridge Leadership Computing Facility indicate that using a precomputed list of interface cells significantly decreased the wall-clock time on GPUs. The speed-up due to GPU acceleration was evaluated for a full Summit node and measured to be 1.8X when comparing a run with 6 MPI tasks and 6 GPUs against 36 MPI tasks on the CPU only. The speed-up was 7.9X when comparing 6 MPI tasks with 6 GPUs against 6 MPI tasks running on the CPU only. Performance measurements showed that system total time is almost constant for runs with more than 96 MPI tasks (or GPUs), indicating excellent weak-scaling performance of the GPU-accelerated code. Finally, a rapid directional solidification problem was considered to demonstrate the capability of the CA code on Summit. It was found that a mesh size of at least 0.05 μm is recommended for AM-like simulations in order to obtain accurate elongated grain microstructure and elongated subgrain features, which are in good qualitative agreement with experimental data. The results presented in this study indicate that the performance strategies on GPU-based HPC platforms for the CA code are appropriate for novel exascale HPC platforms.
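
The abstract credits a precomputed list of interface cells with a significant reduction in wall-clock time on GPUs. The following is only a minimal sketch of that general idea, not the authors' implementation: it is written in plain Python rather than OpenMP offload code, and the state labels, the function names (`build_interface_list`, `grow_full_scan`, `grow_with_list`), the growth rule, and the grid are all hypothetical. It contrasts a full-grid scan that branches on each cell's state with a loop over only the precomputed interface indices; both are expected to produce the same update.

```python
# Hypothetical cell states for illustration; not taken from the paper.
LIQUID, INTERFACE, SOLID = 0, 1, 2


def build_interface_list(state):
    """Collect the flat indices of all cells currently marked INTERFACE."""
    return [i for i, s in enumerate(state) if s == INTERFACE]


def grow_full_scan(state, fraction_solid, rate):
    """Baseline: visit every cell and branch on its state."""
    out = list(fraction_solid)
    for i, s in enumerate(state):
        if s == INTERFACE:
            out[i] = min(1.0, out[i] + rate)
    return out


def grow_with_list(interface_idx, fraction_solid, rate):
    """Visit only the cells in the precomputed interface list (no branching)."""
    out = list(fraction_solid)
    for i in interface_idx:
        out[i] = min(1.0, out[i] + rate)
    return out


# Assumed toy problem: a 4x4 grid stored row-major, with a solid column,
# one column of interface cells next to it, and liquid elsewhere.
nx = ny = 4
state = [LIQUID] * (nx * ny)
fraction_solid = [0.0] * (nx * ny)
for row in range(ny):
    state[row * nx + 0] = SOLID
    fraction_solid[row * nx + 0] = 1.0
    state[row * nx + 1] = INTERFACE
    fraction_solid[row * nx + 1] = 0.5

interface_idx = build_interface_list(state)
full = grow_full_scan(state, fraction_solid, 0.25)
listed = grow_with_list(interface_idx, fraction_solid, 0.25)
print(len(interface_idx), "of", nx * ny, "cells are interface cells")
print("results match:", full == listed)
```

In this toy grid only a quarter of the cells are interface cells, so the list-based loop does a quarter of the iterations and avoids the per-cell branch. Any benefit on a real GPU would depend on details the abstract does not give, such as how the list is built and kept up to date as the interface moves.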

Original language: English
Article number: 108605
Journal: Computer Physics Communications
Volume: 284
State: Published - Mar 2023

Funding

Notice: This submission was sponsored by a contractor of the United States Government under contract DE-AC05-00OR22725 with the United States Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

This research was conducted for the project “ExaAM: Transforming Additive Manufacturing through Exascale Simulation,” which is supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. The research was performed under the auspices of the US Department of Energy by Oak Ridge National Laboratory under contract No. DE-AC05-00OR22725, UT-Battelle, LLC.

Funders (with funder number, where listed):
DOE Public Access Plan: 17-SC-20-SC
United States Government
U.S. Department of Energy
Office of Science: DE-AC05-00OR22725
National Nuclear Security Administration
Oak Ridge National Laboratory

    Keywords

    • Additive manufacturing
    • Cellular automata
    • GPU
    • Solidification
