Exploring OpenSHMEM model to program GPU-based extreme-scale systems

Sreeram Potluri, Davide Rossetti, Donald Becker, Duncan Poole, Manjunath Gorentla Venkata, Oscar Hernandez, Pavel Shamis, M. Graham Lopez, Mathew Baker, Wendy Poole

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Extreme-scale systems with compute accelerators such as Graphics Processing Units (GPUs) have become popular for executing scientific applications. These systems are typically programmed using MPI and CUDA (for NVIDIA-based GPUs). However, there are many drawbacks to the MPI+CUDA approach. The orchestration required between the compute and communication phases of the application execution, and the constraint that communication can only be initiated from serial portions on the Central Processing Unit (CPU), lead to scaling bottlenecks. To address these drawbacks, we explore the viability of using OpenSHMEM for programming these systems. In this paper, first, we make a case for supporting GPU-initiated communication and for the suitability of the OpenSHMEM programming model. Second, we present NVSHMEM, a prototype implementation of the proposed programming approach; port Stencil and Transpose benchmarks, which are representative of many scientific applications, from the MPI+CUDA model to OpenSHMEM; and evaluate the design and implementation of NVSHMEM. Finally, we provide a discussion on the opportunities and challenges of using OpenSHMEM to program these systems, and propose extensions to OpenSHMEM to achieve the full potential of this programming approach.

Original language: English
Title of host publication: OpenSHMEM and Related Technologies
Subtitle of host publication: Experiences, Implementations, and Technologies - 2nd Workshop, OpenSHMEM 2015, Revised Selected Papers
Editors: Manjunath Gorentla Venkata, Pavel Shamis, Neena Imam, M. Graham Lopez
Publisher: Springer Verlag
Pages: 18-35
Number of pages: 18
ISBN (Print): 9783319264271
DOIs
State: Published - 2015
Event: 2nd Workshop on OpenSHMEM and Related Technologies, OpenSHMEM 2015 - Annapolis, United States
Duration: Aug 4 2015 to Aug 6 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9397
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd Workshop on OpenSHMEM and Related Technologies, OpenSHMEM 2015
Country/Territory: United States
City: Annapolis
Period: 08/4/15 to 08/6/15

Funding

The work at NVIDIA is funded by the U.S. Department of Energy under subcontract 7078610 with Lawrence Berkeley National Laboratory. The work at Oak Ridge National Laboratory (ORNL) is supported by the United States Department of Defense and used the resources of the Extreme Scale Systems Center located at ORNL. In addition, the authors would like to thank Stephen Poole (DoD) for his review of this work and for many technical discussions that helped shape the ideas presented in the paper.

Funders (funder number, where listed)
U.S. Department of Defense
U.S. Department of Energy (7078610)
Oak Ridge National Laboratory
Lawrence Berkeley National Laboratory
NVIDIA
