Experimental Characterization of OpenMP Offloading Memory Operations and Unified Shared Memory Support

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The OpenMP specification recently introduced support for unified shared memory, allowing implementations to leverage underlying system software to provide a simpler GPU offloading model in which explicit mapping of variables is optional. Support for this feature is becoming available in a growing number of OpenMP implementations across hardware platforms. A deeper understanding of each implementation's execution profile and performance is crucial for applications as they weigh the performance portability implications of adopting a unified memory offloading programming style. This work introduces a benchmark tool that characterizes unified shared memory support in several OpenMP compilers and runtimes, with emphasis on identifying discrepancies between OpenMP implementations in how various memory allocation strategies interact with unified shared memory. We use the benchmark tool to characterize OpenMP compilers on three leading High Performance Computing platforms spanning different CPU and device architectures, and to assess the impact of enabling unified shared memory on the performance of memory-bound code, highlighting implementation differences that applications should account for when considering performance portability across platforms and compilers.
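To make the offloading style described above concrete, the following is a minimal sketch, assuming a compiler and runtime with OpenMP 5.x unified shared memory support (for example, a recent Clang/LLVM built with offloading enabled). The triad kernel, array sizes, and timing shown here are illustrative only and are not taken from the paper's benchmark tool.

    /* Minimal sketch (not the paper's benchmark tool): a memory-bound
       triad kernel offloaded without explicit map() clauses. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <omp.h>

    /* Request unified shared memory: with it in effect, host
       allocations are directly accessible from the device, so
       mapping a, b, and c explicitly becomes optional. */
    #pragma omp requires unified_shared_memory

    int main(void) {
        const size_t n = 1 << 24;  /* ~16M doubles per array; illustrative */
        double *a = malloc(n * sizeof *a);
        double *b = malloc(n * sizeof *b);
        double *c = malloc(n * sizeof *c);

        for (size_t i = 0; i < n; ++i) { b[i] = 1.0; c[i] = 2.0; }

        double t0 = omp_get_wtime();
        /* No map() clauses: host pointers are used as-is on the device.
           How the runtime backs these accesses (page migration,
           zero-copy, pinned memory) varies between implementations. */
        #pragma omp target teams distribute parallel for
        for (size_t i = 0; i < n; ++i)
            a[i] = b[i] + 3.0 * c[i];
        double t1 = omp_get_wtime();

        printf("triad: %.3f ms, a[0] = %.1f\n", (t1 - t0) * 1e3, a[0]);
        free(a); free(b); free(c);
        return 0;
    }

Because the target region carries no map() clauses, correctness relies on the requires unified_shared_memory directive, and performance depends on how each runtime services device accesses to host-allocated pages; those implementation differences are the kind of discrepancy the paper's characterization targets.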

Original language: English
Title of host publication: OpenMP
Subtitle of host publication: Advanced Task-Based, Device and Compiler Programming - 19th International Workshop on OpenMP, IWOMP 2023, Proceedings
Editors: Simon McIntosh-Smith, Tom Deakin, Michael Klemm, Bronis R. de Supinski, Jannis Klinkenberg
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 210-225
Number of pages: 16
ISBN (Print): 9783031407437
DOIs
State: Published - 2023
Event: 19th International Workshop on OpenMP, IWOMP 2023 - Bristol, United Kingdom
Duration: Sep 13 2023 - Sep 15 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14114 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Workshop on OpenMP, IWOMP 2023
Country/Territory: United Kingdom
City: Bristol
Period: 09/13/23 - 09/15/23

Funding

This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (the Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering, and early testbed platforms, in support of the nation's exascale computing imperative, and in particular its subproject SOLLVE. The views and opinions of the authors do not necessarily reflect those of the U.S. government or Lawrence Livermore National Security, LLC, neither of whom, nor any of their employees, makes any endorsement, express or implied warranty or representation, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of the information contained herein.

This work was prepared in part by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-CONF-827970 and LLNL-CONF-849438) and supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research. It was also supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation at Argonne National Laboratory.

This research used resources of the Oak Ridge Leadership Computing Facility at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725, and resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231.

This research was supported by the Israeli Council for Higher Education (CHE) via the Data Science Research Center, Ben-Gurion University of the Negev, Israel; Intel Corporation (oneAPI CoE program); and the Lynn and William Frankel Center for Computer Science. Computational support was provided by the NegevHPC project [5] and Intel Developer Cloud [26]. The authors thank Re'em Harel, Israel Hen, and Gabi Dadush for their help and support.

This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No. 951732. We acknowledge the Danish e-Infrastructure Cooperation (DeiC), Denmark, for awarding this project access to the LUMI supercomputer, owned by the EuroHPC Joint Undertaking and hosted by CSC (Finland) and the LUMI consortium, through DeiC National HPC and the "Compiler development" project (g.a. DeiC-DTU-N5-20230033). We also acknowledge DCC [4] for providing access to compute resources.

This work is supported by the São Paulo Research Foundation (grants 18/07446-8, 20/01665-0, and 18/15519-5).

Funders (funder number):

Data Science Research Center
DeiC National HPC: DeiC-DTU-N5-20230033
European High-Performance Computing Joint Undertaking: 951732
Intel Developer Cloud
Lynn and William Frankel Center for Computer Science
Office of Advanced Scientific Computing Research: DE-AC02-06CH11357
Office of Science and National Nuclear Security Administration
U.S. Department of Energy organizations
U.S. Department of Energy: DE-AC05-00OR22725
Division of Chemistry
Intel Corporation
Office of Science
National Nuclear Security Administration
Advanced Scientific Computing Research
Lawrence Livermore National Laboratory: LLNL-CONF-827970, DE-AC52-07NA27344, LLNL-CONF-849438
Lawrence Berkeley National Laboratory: DE-AC02-05CH11231
Fundação de Amparo à Pesquisa do Estado de São Paulo: 20/01665-0, 18/07446-8, 18/15519-5
Ben-Gurion University of the Negev
Council for Higher Education

Keywords

• Offloading
• OpenMP
• Unified Shared Memory
