Abstract
The suitability of a spectral-element-based dynamical core (HOMME) within the Community Atmospheric Model (CAM) for GPU-based architectures is examined, and initial performance results are reported. This work was done within a project to enable CAM to run at high resolution on next-generation, multi-petaflop systems. The dynamical core is the present focus because it dominates the performance profile of our target problem. HOMME scales well because its underlying cubed-sphere mesh admits a full two-dimensional decomposition and all computational work is localized within each element. The thread blocking and code changes that allow HOMME to use GPUs effectively are described, along with a rewritten vertical remapping scheme that improves performance on both CPUs and GPUs. Validation of results in the full HOMME model is also described. We demonstrate that the most expensive kernel in the model executes more than three times faster on the GPU than on the CPU. These gains are expected to improve efficiency when incorporated into the full model configured for the target problem. Remaining performance issues include optimizing the boundary exchanges when multiple spectral elements are computed on the GPU.
Original language | English |
---|---|
Pages (from-to) | 335-347 |
Number of pages | 13 |
Journal | International Journal of High Performance Computing Applications |
Volume | 27 |
Issue number | 3 |
State | Published - Aug 2013 |
Funding
MAT and KJE have been supported by the DOE BER SciDAC project, ‘A Scalable and Extensible Earth System’. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract Number DE-AC05-00OR22725.
Funders | Funder number |
---|---|
DOE BER | |
U.S. Department of Energy | DE-AC05-00OR22725 |
Office of Science | |
Keywords
- CAM
- GPU
- HOMME
- scalability
- tracer