Abstract
The cloud microphysics scheme, CASIM, and the radiation scheme, SOCRATES, are two computationally intensive parts of the Met Office's Unified Model (UM). This study enables CASIM and SOCRATES to use accelerated multi-core systems for optimal computational performance of the UM. Using profiling to guide our efforts, we refactored the code for optimal threading and kernel arrangement and implemented OpenACC directives manually or through the CLAW source-to-source translator. Initial porting results achieved speedups of 10.02x and 9.25x in CASIM and SOCRATES, respectively, on 1 GPU compared with 1 CPU core. A granular performance analysis of the strategy and its bottlenecks is discussed. These improvements will enable the UM to run on heterogeneous computers, and a path forward for further improvements is provided.
Original language | English |
---|---|
Title of host publication | Proceedings of the Platform for Advanced Scientific Computing Conference, PASC 2021 |
Publisher | Association for Computing Machinery, Inc |
ISBN (Electronic) | 9781450385633 |
DOIs | |
State | Published - Jul 5 2021 |
Event | 2021 Platform for Advanced Scientific Computing Conference, PASC 2021 - Virtual, Online, Switzerland Duration: Jul 5 2021 → Jul 9 2021 |
Publication series
Name | Proceedings of the Platform for Advanced Scientific Computing Conference, PASC 2021 |
---|
Conference
Conference | 2021 Platform for Advanced Scientific Computing Conference, PASC 2021 |
---|---|
Country/Territory | Switzerland |
City | Virtual, Online |
Period | 07/5/21 → 07/9/21 |
Funding
This research was supported by the U.S. Air Force LCMC collaboration with Oak Ridge National Laboratory (ORNL). The computational resources on Summit are provided by the Oak Ridge Leadership Computing Facility (OLCF) Director's Discretion Project ATM112. The OLCF at ORNL is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Furthermore, we would like to acknowledge the contributions of Youngsung Kim at ORNL for insightful suggestions on porting algorithm development and performance bottleneck detection. We also appreciate the great help with the CLAW implementation from Valentim Clement at ORNL.
Keywords
- CASIM
- GPU porting
- OpenACC
- SOCRATES
- Unified model