Abstract
High-performance computing (HPC) increasingly relies on heterogeneous architectures to achieve higher performance. At the Oak Ridge Leadership Computing Facility (OLCF), Oak Ridge, TN, USA, this trend continues as its latest supercomputer, Summit, entered production in early 2019. The combination of IBM POWER9 CPUs and NVIDIA V100 GPUs, along with a fast NVLink2 interconnect and other recent technologies, pushes system performance to a new height and breaks the exascale barrier by certain measures. Because of Summit's powerful GPUs and much higher GPU-to-CPU ratio, offloading to accelerators becomes a requirement for any application that intends to use the system effectively. To help navigate a complex landscape of competing heterogeneous architectures, a collection of applications from a wide spectrum of scientific domains was selected for early adoption on Summit. In this article, the experience and lessons learned are summarized, in the hope of providing useful guidance for addressing new programming challenges, such as scalability, performance portability, and software maintainability, in future application development efforts on heterogeneous HPC systems.
Original language | English |
---|---|
Article number | 8960361 |
Journal | IBM Journal of Research and Development |
Volume | 64 |
Issue number | 3-4 |
State | Published - May 1 2020 |
Funding
The research projects described in this article used resources of the OLCF, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. Models in E3SM-MMF were obtained from the E3SM project, sponsored by the U.S. DOE, Office of Science, Office of Biological and Environmental Research. Development work on GronOR is part of the (Shell-NWO) research program of the Foundation for Fundamental Research on Matter, which is part of the Netherlands Organization for Scientific Research (NWO), and part of a European Joint Doctorate (EJD) in Theoretical Chemistry and Computational Modelling (TCCM), which has been financed under the framework of the Innovative Training Networks (ITN) of the Marie Skłodowska-Curie Actions (ITN-EJD-642294-TCCM). FLASH was developed, in part, by the DOE NNSA ASC- and DOE Office of Science ASCR-supported Flash Center for Computational Science at the University of Chicago. Additional support for FLASH development was provided by the ECP (17-SC-20-SC), a collaborative effort of the U.S. DOE Office of Science and the NNSA.
Funders | Funder number |
---|---|
Netherlands Organization for Scientific Research | |
Office of Biological and Environmental Research | |
U.S. DOE | |
U.S. Department of Energy | |
Office of Science | DE-AC05-00OR22725 |
National Nuclear Security Administration | |
University of Chicago | 17-SC-20-SC |
H2020 Marie Skłodowska-Curie Actions | ITN-EJD-642294-TCCM |
Nederlandse Organisatie voor Wetenschappelijk Onderzoek | |