A case study of CUDA FORTRAN and OpenACC for an atmospheric climate kernel

Matthew Norman, Jeffrey Larkin, Aaron Vose, Katherine Evans

Research output: Contribution to journal, Article, peer-review

38 Scopus citations

Abstract

This paper compares an OpenACC port of a key kernel in the tracer advection routines of the Community Atmosphere Model - Spectral Element (CAM-SE) for Graphics Processing Units (GPUs) against an existing CUDA FORTRAN port. Developing the OpenACC kernel was substantially simpler than developing the CUDA port, but the OpenACC version ran about 1.5× slower than the optimized CUDA version. Particular focus is given to the maturity of compiler support for OpenACC in modern FORTRAN, and the Cray implementation is found to be currently more mature than the PGI implementation. Still, for the case that ran successfully on PGI, the PGI OpenACC runtime was slightly faster than Cray's. The results show encouraging performance for OpenACC relative to CUDA, while also exposing some issues that may need to be resolved before the implementations are suitable for porting all of CAM-SE. Most notably, future OpenACC implementations should make use of GPU shared memory, and derived type support should be expanded.
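To illustrate why a directive-based port can be simpler than a hand-written CUDA kernel, here is a minimal sketch in C. It is not the CAM-SE kernel and it is not FORTRAN: the function `advect_upwind`, its arguments, and the first-order upwind scheme with periodic boundaries are all hypothetical, chosen only to show the shape of an OpenACC-annotated loop.

```c
#include <stddef.h>

/* Hypothetical 1D upwind tracer update (an illustration, NOT the CAM-SE kernel).
 * q_new[i] = q[i] - c * (q[i] - q[i-1]), periodic in i, where c = u*dt/dx
 * is an assumed constant Courant number with u > 0. */
void advect_upwind(const double *q, double *q_new, int n, double c)
{
    /* With an OpenACC-capable compiler this single directive requests GPU
     * offload and the data movement; other compilers ignore the pragma and
     * the loop simply runs serially on the host. */
    #pragma acc parallel loop copyin(q[0:n]) copyout(q_new[0:n])
    for (int i = 0; i < n; ++i) {
        int im1 = (i == 0) ? n - 1 : i - 1;   /* periodic left neighbor */
        q_new[i] = q[i] - c * (q[i] - q[im1]);
    }
}
```

The loop body is ordinary host code and the port consists of one added directive. An equivalent CUDA version would need an explicit kernel, launch configuration, and device memory management, which is also where hand tuning such as explicit shared-memory use becomes possible.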

Original language: English
Pages (from-to): 1-6
Number of pages: 6
Journal: Journal of Computational Science
Volume: 9
State: Published - Jul 1 2015

Funding

This research used resources of the National Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Keywords

  • CUDA
  • Climate
  • GPU
  • HPC
  • OpenACC
