Abstract
The arrival of next-generation computing platforms offers a tremendous opportunity to advance the state of the art in global atmospheric dynamical models. We detail our incremental approach to utilizing this emerging technology by enhancing concurrency within the High-Order Method Modeling Environment (HOMME) atmospheric dynamical model developed at the National Center for Atmospheric Research (NCAR). The study focused on improvements to the performance of HOMME, which is a Fortran 90 code with a hybrid (MPI+OpenMP) programming model. The article describes the changes made to the use of the message passing interface (MPI) and OpenMP, as well as single-core optimizations, to achieve significant improvements in concurrency and overall code performance. For our optimization studies, we utilize the “Cori” system with an Intel Xeon Phi Knights Landing processor deployed at the National Energy Research Scientific Computing Center and the “Cheyenne” system with an Intel Xeon Broadwell processor installed at NCAR. The results from the studies, using “workhorse” configurations run at NCAR, show that these changes have a transformative impact on the computational performance of HOMME. Our improvements have shown that we can effectively increase potential concurrency by efficiently threading the vertical dimension. Further, we have seen a factor-of-two overall improvement in the computational performance of the code resulting from the single-core optimizations. Most notably, our incremental approach allows for high-impact changes without disrupting existing scientific productivity in the HOMME community.
| Original language | English |
|---|---|
| Pages (from-to) | 1030-1045 |
| Number of pages | 16 |
| Journal | International Journal of High Performance Computing Applications |
| Volume | 33 |
| Issue number | 5 |
| DOIs | |
| State | Published - Sep 1 2019 |
| Externally published | Yes |
Funding
The authors would like to acknowledge the support provided by the National Energy Research Scientific Computing Center (NERSC) through the NERSC Exascale Science Application Program (NESAP). They would also like to thank Mark Taylor and Irina Demeshko of Sandia National Laboratories for optimizations to the limiter; Martyn Corden from Intel and Marcus Wagner from Cray for their insightful feedback on early drafts of the article; Bob Walkup from IBM Watson Research for his assistance with the MPI profiling and Vprof performance tools; James Rosinski from NOAA/ESRL for providing the changes needed to implement GPTL in nested regions; and Helen He from NERSC for her invaluable assistance with technical issues on Cori at NERSC. The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This effort was supported by the National Science Foundation under grant (NSF01) and by an Intel Parallel Computing Center for Weather and Climate Simulation.
| Funders | Funder number |
|---|---|
| Intel Parallel Computing Center | |
| NOAA/ESRL | |
| National Energy Research Scientific Computing Center | |
| National Science Foundation | NSF01 |
| Intel Corporation | |
| International Business Machines Corporation | |
Keywords
- Performance engineering
- Code optimization
- Dynamical core
- Earth system modeling
- High-performance computing application
- Spectral element