A High-Performance Design for Hierarchical Parallelism in the QMCPACK Monte Carlo code

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

We introduce a new high-performance design for parallelism within the Quantum Monte Carlo code QMCPACK. We demonstrate that the new design exploits the hierarchical parallelism of heterogeneous architectures better than the previous GPU implementation. The new version achieves higher GPU occupancy via the new concept of crowds of Monte Carlo walkers, and by enabling more host CPU threads to offload effectively to the GPU. The higher performance is expected to be achieved independent of the underlying hardware, significantly improving developer productivity and reducing code maintenance costs. Scientific productivity is also improved with full support for fallback to CPU execution when GPU implementations are not available or CPU execution is more efficient.

Original language: English
Title of host publication: Proceedings of HiPar 2022
Subtitle of host publication: 3rd Workshop on Hierarchical Parallelism for Exascale Computing, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 22-27
Number of pages: 6
ISBN (Electronic): 9781665463454
DOIs
State: Published - 2022
Event: 3rd IEEE/ACM International Workshop on Hierarchical Parallelism for Exascale Computing, HiPar 2022 - Dallas, United States
Duration: Nov 13 2022 → Nov 18 2022

Publication series

Name: Proceedings of HiPar 2022: 3rd Workshop on Hierarchical Parallelism for Exascale Computing, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 3rd IEEE/ACM International Workshop on Hierarchical Parallelism for Exascale Computing, HiPar 2022
Country/Territory: United States
City: Dallas
Period: 11/13/22 → 11/18/22

Funding

ACKNOWLEDGMENT This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funders: Funder number
DOE Public Access Plan
U.S. Department of Energy
Office of Science: DE-AC05-00OR22725, DE-AC02-06CH11357
National Nuclear Security Administration

Keywords

• GPUs
• Heterogeneous computing
• Monte Carlo
