Abstract
Extreme-scale systems with compute accelerators such as Graphics Processing Units (GPUs) have become popular for executing scientific applications. These systems are typically programmed using MPI and CUDA (for NVIDIA-based GPUs). However, the MPI+CUDA approach has many drawbacks. The orchestration required between the compute and communication phases of application execution, and the constraint that communication can only be initiated from serial portions on the Central Processing Unit (CPU), lead to scaling bottlenecks. To address these drawbacks, we explore the viability of using OpenSHMEM for programming these systems. In this paper, first, we make a case for supporting GPU-initiated communication and for the suitability of the OpenSHMEM programming model. Second, we present NVSHMEM, a prototype implementation of the proposed programming approach; port the Stencil and Transpose benchmarks, which are representative of many scientific applications, from the MPI+CUDA model to OpenSHMEM; and evaluate the design and implementation of NVSHMEM. Finally, we discuss the opportunities and challenges of using OpenSHMEM to program these systems, and propose extensions to OpenSHMEM to achieve the full potential of this programming approach.
Original language | English |
---|---|
Title of host publication | OpenSHMEM and Related Technologies |
Subtitle of host publication | Experiences, Implementations, and Technologies - 2nd Workshop, OpenSHMEM 2015, Revised Selected Papers |
Editors | Manjunath Gorentla Venkata, Pavel Shamis, Neena Imam, M. Graham Lopez |
Publisher | Springer Verlag |
Pages | 18-35 |
Number of pages | 18 |
ISBN (Print) | 9783319264271 |
State | Published - 2015 |
Event | 2nd Workshop on OpenSHMEM and Related Technologies, OpenSHMEM 2015 - Annapolis, United States; Duration: Aug 4 2015 → Aug 6 2015 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 9397 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 2nd Workshop on OpenSHMEM and Related Technologies, OpenSHMEM 2015 |
---|---|
Country/Territory | United States |
City | Annapolis |
Period | 08/4/15 → 08/6/15 |
Funding
The work at NVIDIA is funded by the U.S. Department of Energy under subcontract 7078610 with Lawrence Berkeley National Laboratory. The work at Oak Ridge National Laboratory (ORNL) is supported by the United States Department of Defense and used the resources of the Extreme Scale Systems Center located at ORNL. In addition, the authors would like to thank Stephen Poole (DoD) for his review of this work and for the many technical discussions that helped shape the ideas presented in the paper.