Abstract
Ytopt is a machine-learning-based autotuning software package, written in Python and developed within the ECP PROTEAS-TUNE project. It adopts an asynchronous search framework that samples a small number of input parameter configurations and progressively fits a surrogate model over the input-output space until the user-defined maximum number of evaluations or the wall-clock time limit is exhausted. libEnsemble, developed within the ECP PETSc/TAO project, is a Python toolkit for coordinating workflows of asynchronous and dynamic ensembles of calculations across massively parallel resources. libEnsemble helps users take advantage of massively parallel resources to solve design, decision, and inference problems and expands the class of problems that can benefit from increased parallelism. In this paper we present our methodology and framework for integrating ytopt and libEnsemble to take advantage of massively parallel resources and accelerate the autotuning process. Specifically, we focus on using the proposed framework to autotune the ECP ExaSMR application OpenMC, an open source Monte Carlo particle transport code. OpenMC has seven tunable parameters, some of which have large ranges; for example, the number of particles in flight ranges from 100,000 to 8 million, with a default setting of 1 million. Finding the combination of parameter values that achieves the best performance is extremely time-consuming. Therefore, we apply the proposed framework to autotune the MPI/OpenMP offload version of OpenMC based on a user-defined metric, such as the figure of merit (FoM, in particles/s) or the energy-delay product (EDP) as a measure of energy efficiency, on Crusher at the Oak Ridge Leadership Computing Facility. The experimental results show improvements of up to 29.49% in FoM and up to 30.44% in EDP.
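The search procedure described in the abstract can be illustrated with a minimal sequential sketch. It does not use the ytopt or libEnsemble APIs: the objective `toy_objective`, the nearest-neighbor surrogate `knn_predict`, and the driver `autotune` are hypothetical names introduced here for illustration, and the loop evaluates one configuration at a time rather than asynchronously. Only the parameter range (100,000 to 8 million particles in flight) is taken from the abstract.

```python
import random
import time

# Range taken from the abstract: particles in flight, 100,000 to 8 million.
LOW, HIGH = 100_000, 8_000_000


def toy_objective(particles):
    """Hypothetical stand-in for a measured run time; NOT a model of OpenMC."""
    best = 3_000_000  # arbitrary optimum chosen purely for illustration
    return 1.0 + ((particles - best) / HIGH) ** 2


def knn_predict(history, x, k=3):
    """Toy surrogate: mean objective of the k evaluated points nearest to x."""
    nearest = sorted(history, key=lambda item: abs(item[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def autotune(objective, max_evals=30, max_seconds=5.0, n_initial=5, seed=0):
    rng = random.Random(seed)
    start = time.monotonic()
    history = []  # list of (configuration, measured value) pairs

    def budget_left():
        # Stop on whichever limit is hit first: evaluations or wall-clock time.
        within_evals = len(history) < max_evals
        within_time = time.monotonic() - start < max_seconds
        return within_evals and within_time

    # Phase 1: evaluate a small number of randomly sampled configurations.
    while len(history) < n_initial and budget_left():
        x = rng.randint(LOW, HIGH)
        history.append((x, objective(x)))

    # Phase 2: the surrogate uses all data collected so far; evaluate the
    # candidate it predicts to be best among a fresh random pool.
    while budget_left():
        pool = [rng.randint(LOW, HIGH) for _ in range(200)]
        x = min(pool, key=lambda c: knn_predict(history, c))
        history.append((x, objective(x)))

    return min(history, key=lambda item: item[1]), history


best, history = autotune(toy_objective)
print(f"best configuration: {best[0]} particles, objective {best[1]:.4f}")
```

The nearest-neighbor surrogate is a deliberately simple substitute chosen to keep the sketch dependency-free; a real autotuner would typically use a stronger model and balance exploration against exploitation when proposing the next configuration.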
Original language | English |
---|---|
Journal | International Journal of High Performance Computing Applications |
State | Accepted/In press - 2024 |
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the DOE ECP PROTEAS-TUNE project, in part by the DOE ASCR SciDAC RAPIDS2 and OASIS, in part by the ECP PETSc/TAO project, and in part by the ECP ExaSMR Project. We acknowledge the Oak Ridge Leadership Computing Facility for the use of Crusher and Frontier under project CSC383, and Kevin Huck of the University of Oregon for power measurement support with APEX on Crusher. This material is based upon work supported by the U.S. Department of Energy, Office of Science, under contract number DE-AC02-06CH11357, at Argonne National Laboratory.
Funders | Funder number |
---|---|
Advanced Scientific Computing Research | |
U.S. Department of Energy | |
Argonne National Laboratory | |
Office of Science | DE-AC02-06CH11357 |
Keywords
- Autotuning
- energy
- Exascale Computing Project
- libEnsemble
- machine learning
- OpenMC
- performance
- ytopt