Performance Portability of a Wilson Dslash Stencil Operator Mini-App Using Kokkos and SYCL

Balint Joo, Thorsten Kurth, M. A. Clark, Jeongnim Kim, Christian Robert Trott, Dan Ibanez, Daniel Sunderland, Jack Deslippe

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

22 Scopus citations

Abstract

We describe our experiences in creating mini-apps for the Wilson-Dslash stencil operator for Lattice Quantum Chromodynamics using the Kokkos and SYCL programming models. In particular, we comment on the performance achieved on a variety of hardware architectures, the limitations we have reached in both programming models, and how these have been resolved by us or may be resolved by the developers of these models.

Original language: English
Title of host publication: Proceedings of P3HPC 2019
Subtitle of host publication: International Workshop on Performance, Portability and Productivity in HPC - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 14-25
Number of pages: 12
ISBN (Electronic): 9781728160030
State: Published - Nov 2019
Externally published: Yes
Event: 2019 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC, P3HPC 2019 - Denver, United States
Duration: Nov 22 2019 → …

Publication series

Name: Proceedings of P3HPC 2019: International Workshop on Performance, Portability and Productivity in HPC - Held in conjunction with SC 2019: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 2019 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC, P3HPC 2019
Country/Territory: United States
City: Denver
Period: 11/22/19 → …

Funding

This work was funded by the U.S. Department of Energy under the Exascale Computing Project by the Office of Advanced Scientific Computing Research and through the Scientific Discovery through Advanced Computing (SciDAC) program of the U.S. Department of Energy Offices of Nuclear Physics and Advanced Scientific Computing Research (ASCR). B. Joó gratefully acknowledges the NERSC Exascale Scientific Applications Program (NESAP) of NERSC for supporting a Summer Associateship at NERSC to work on the material presented in this paper. We gratefully acknowledge use of computer systems at NERSC, Jefferson Lab, Argonne Leadership Computing Facility and Oak Ridge Leadership Computing Facility for development and benchmarking during this work. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics under contract DE-AC05-06OR23177. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA-0003525. This research is funded by the Exascale Computing Project and the Scientific Discovery through Advanced Computing program of the U.S. Department of Energy, by the Offices of Advanced Scientific Computing Research (ASCR) and Nuclear Physics (NP). Authored by Jefferson Science Associates, LLC under U.S. DOE Contract No. DE-AC05-06OR23177. The U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce this manuscript for U.S. Government purposes.

Funders                                   Funder number
U.S. Department of Energy
Office of Science                         DE-AC02-05CH11231, DE-AC02-06CH11357
National Nuclear Security Administration  DE-NA-0003525
Advanced Scientific Computing Research
Nuclear Physics                           DE-AC05-00OR22725, DE-AC05-06OR23177

Keywords

• Kokkos
• Lattice QCD
• Performance
• Portability
• SYCL
• Wilson Dslash
