MPI + OpenACC: Accelerating radiation transport mini-application, minisweep, on heterogeneous systems

Robert Searles, Sunita Chandrasekaran, Wayne Joubert, Oscar Hernandez

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Architectures are rapidly evolving, and exascale machines are expected to offer billion-way concurrency. We need to rethink algorithms, languages and programming models, among other components, in order to migrate large-scale applications and explore parallelism on these machines. Although directive-based programming models allow programmers to worry less about programming and more about science, expressing complex parallel patterns in these models can be a daunting task, especially when the goal is to match the performance that the hardware platforms can offer. One such pattern is wavefront. This paper extensively studies a wavefront-based miniapplication for Denovo, a production code for nuclear reactor modeling. We parallelize the Koch–Baker–Alcouffe (KBA) parallel-wavefront sweep algorithm in the main kernel of Minisweep (the miniapplication) using CUDA 9.0, OpenMP 4.0 (SIMD) and OpenACC 2.6. Our OpenACC implementation running on NVIDIA's next-generation Volta GPU achieves an 85.06x speedup over serial code, which is larger than CUDA's 83.72x speedup over the same serial implementation. We also explore the scalability of our solution using MPI to decompose our simulation domain, allowing us to run on many nodes and accelerators present in state-of-the-art HPC systems. Our parallelization effort across platforms also motivated us to define an abstract parallelism model that is architecture independent, with the goal of creating software abstractions that can be used by applications employing the wavefront sweep motif.

Program summary

Program Title: Minisweep
Program Files doi: http://dx.doi.org/10.17632/cbcp37t8gf.1
Licensing provisions: BSD 2-clause
Programming language: C
Nature of problem: The Minisweep proxy application [1] is part of the Profugus radiation transport miniapp project [2] that reproduces the computational pattern of the sweep kernel of the Denovo Sn radiation transport code [3]. The sweep kernel is responsible for most of the computational expense (80%–99%) of Denovo. Denovo, a production code for nuclear reactor neutronics modeling, is in use by a current DOE INCITE project to model the International Thermonuclear Experimental Reactor (ITER) fusion reactor [4]. The many runs of this code required to perform reactor simulations at high node counts make it an important target for efficient mapping to accelerated architectures.
Solution method: This work proposes an abstract parallelism model for efficiently mapping wavefront applications to modern HPC architectures. Minisweep is used as a case study for evaluating this technique. Our evaluation is performed using OpenACC to target many architectures.
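The wavefront pattern named in the abstract can be illustrated with a small sketch. The code below is not taken from Minisweep: it is a hypothetical 2D simplification in which each cell depends on its west and south neighbours, so all cells on one anti-diagonal (i + j = w) are independent and can be updated in parallel. The function names, the update formula, the zero boundary inflow, and the placement of the OpenACC directives are all assumptions made for illustration. The actual KBA sweep works on a 3D grid with additional dimensions. A compiler without OpenACC support will typically ignore the pragmas and run the loops serially.

```c
#include <stddef.h>

/* Hypothetical serial reference: row-major grid g of size nx * ny.
   Each cell adds the average of its upstream (west, south) neighbours;
   cells on the inflow boundary see an upstream value of 0. */
void sweep_serial(double *g, int nx, int ny) {
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            double west  = (i > 0) ? g[(size_t)j * nx + (i - 1)] : 0.0;
            double south = (j > 0) ? g[(size_t)(j - 1) * nx + i] : 0.0;
            g[(size_t)j * nx + i] += 0.5 * (west + south);
        }
    }
}

/* Wavefront version: wavefronts are processed in order, and the cells
   within one wavefront are independent of each other. */
void sweep_wavefront(double *g, int nx, int ny) {
    /* Assumed data placement: keep the grid on the device for the whole sweep. */
    #pragma acc data copy(g[0:nx*ny])
    {
        for (int w = 0; w < nx + ny - 1; ++w) {
            /* Range of i such that j = w - i stays inside [0, ny). */
            int i_lo = (w >= ny) ? w - ny + 1 : 0;
            int i_hi = (w < nx) ? w : nx - 1;

            #pragma acc parallel loop present(g[0:nx*ny])
            for (int i = i_lo; i <= i_hi; ++i) {
                int j = w - i;
                double west  = (i > 0) ? g[(size_t)j * nx + (i - 1)] : 0.0;
                double south = (j > 0) ? g[(size_t)(j - 1) * nx + i] : 0.0;
                g[(size_t)j * nx + i] += 0.5 * (west + south);
            }
        }
    }
}
```

Both functions perform the same per-cell arithmetic on the same inputs, so they should produce identical grids. Only the traversal order differs, which is what exposes the parallelism within each wavefront.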

Original language: English
Pages (from-to): 176-187
Number of pages: 12
Journal: Computer Physics Communications
Volume: 236
State: Published - Mar 2019

Funding

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This material is based upon work supported by the National Science Foundation (NSF) under grant no. 1814609.

Keywords

  • Acceleration
  • Minisweep
  • MPI + OpenACC
  • Radiation transport
  • Summit
