Performance analysis of fully explicit and fully implicit solvers within a spectral element shallow-water atmosphere model

Katherine J. Evans, Richard K. Archibald, David J. Gardner, Matthew R. Norman, Mark A. Taylor, Carol S. Woodward, Patrick H. Worley

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Explicit Runge–Kutta methods and implicit multistep methods utilizing a Newton–Krylov nonlinear solver are compared across a range of configurations of the shallow-water dynamical core of the spectral element community atmosphere model to assess their computational performance. These configurations are designed to explore the attributes of each method under different but relevant model usage scenarios including varied spectral order within an element, static regional refinement, and scaling to the largest problem sizes. This analysis is performed within the shallow-water dynamical core option of a full climate model code base to enable a wealth of simulations for study, with the aim of informing solver development within the more complete hydrostatic dynamical core used for climate research. The limitations and benefits of using explicit versus implicit methods, with different parameters and settings, are discussed in light of the trade-offs with Message Passing Interface (MPI) communication and memory and their inherent efficiency bottlenecks. Given the performance behavior across the configurations analyzed here, the recommendation for future work using the implicit solvers is conditional on scale separation and the stiffness of the problem. For the regionally refined configurations, the implicit method has about the same efficiency as the explicit method, without considering efficiency gains from a preconditioner. The potential for improvement using a preconditioner is greatest for higher spectral order configurations, where more work is shifted to the linear solver. Initial simulations with OpenACC directives to utilize a Graphics Processing Unit (GPU) when performing function evaluations show local improvements and indicate that overall gains are possible with adjustments to data exchanges.

Original language: English
Pages (from-to): 268-284
Number of pages: 17
Journal: International Journal of High Performance Computing Applications
Volume: 33
Issue number: 2
State: Published - Mar 1 2019

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This manuscript has been authored by UT-Battelle, LLC and used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, both of which are supported by the Office of Science of the U.S. Department of Energy (DOE) under contract no. DE-AC05-00OR22725. This work was also partially supported under the auspices of DOE by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344, Lawrence Livermore National Security, LLC, LLNL-JRNL-716595 and Sandia National Laboratories, a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for DOE National Nuclear Security Administration under contract DE-NA0003525. Support for this work was provided through the Scientific Discovery through Advanced Computing (SciDAC) program funded by U.S. Department of Energy Office of Advanced Scientific Computing Research and Office of Biological and Environmental Research. The publisher, by accepting the article for publication, acknowledges that the United States Government retains a nonexclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. We acknowledge a Hackathon event hosted by the Oak Ridge Computing Facility for the opportunity and assistance adding OpenACC directives to our code. We thank Oksana Guba for providing a script to create the refined grid plots and two reviewers for their helpful comments that improved the manuscript.

Keywords

  • GPU acceleration
  • Global climate modeling
  • Newton–Krylov
  • implicit methods
  • regional refinement
