Abstract
The partitioned global address space (PGAS) model is popular for applying a classic shared-memory approach to large systems, but some classes of problems rely on large numbers of small remote memory accesses targeting random locations across the network. On modern interconnects this can overwhelm the network, leading to message-rate inefficiencies. This small-message problem can be solved through aggregation strategies; however, these typically require undesirable code restructuring that is cumbersome to incorporate and maintain in user applications. A strategy called “aggregation contexts”, aimed at alleviating this burden, has previously been proposed for the OpenSHMEM PGAS API. Despite its potential, it has not yet been validated for scalability on large systems consisting of thousands of nodes, nor proven to be performance-portable, both of which are critical for its adoption. In this paper, we demonstrate the scalability and performance portability of aggregation contexts using up to 8192 nodes on ORNL’s Frontier system. Our study reveals good scaling patterns while also identifying further opportunities for performance improvements to make it even more effective.
| Original language | English |
|---|---|
| Title of host publication | Lecture Notes in Computer Science |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 103-113 |
| Number of pages | 11 |
| State | Published - 2025 |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Volume | LNCS 14564 |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Funding
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This research used the Frontier and Andes resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. This work was funded through the Strategic Partnership Projects Funding Office via Los Alamos National Laboratory with IAN 61921590 for the project.
Keywords
- OpenSHMEM
- aggregation contexts
- conveyors
- many-to-many communication patterns
- message aggregation