Abstract
OpenSHMEM is a highly efficient one-sided communication API that implements the PGAS parallel programming model, and is known for its low-latency communication operations that can be mapped efficiently to the RDMA capabilities of network interconnects. However, applications that use OpenSHMEM can be sensitive to point-to-point message rates, as many-to-many communication patterns can generate large numbers of small messages, which tend to overwhelm network hardware that has predominantly been optimised for bandwidth over message rate. Additionally, many important emerging classes of problems, such as data analytics, are similarly troublesome because of the irregular access patterns they employ. Message aggregation strategies have been proven to significantly enhance network performance, but their implementation often involves complex restructuring of user code, making them unwieldy. This paper shows how to bring the best qualities of message aggregation into the communication model of OpenSHMEM such that applications with small and irregular access patterns can improve network performance while maintaining their algorithmic simplicity. We do this by providing a path to a message aggregation framework called conveyors through a minimally intrusive OpenSHMEM extension that introduces aggregation contexts, which fit more naturally into the OpenSHMEM model of atomics, gets, and puts. We test these extensions using four of the bale 3.0 applications, which contain essential many-to-many access patterns, to show how they can produce performance improvements of up to 65×.
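The general idea of message aggregation mentioned in the abstract can be illustrated with a minimal, single-process sketch. It is not the conveyors API or the proposed OpenSHMEM extension: the `Aggregator` class, its methods, and the buffer capacity are hypothetical names chosen for illustration, and "sending" is simulated by applying updates to local tables that stand in for remote memory. The sketch only shows why buffering small updates per destination reduces the number of messages placed on the network.

```python
from collections import defaultdict


class Aggregator:
    """Hypothetical per-destination buffer that batches small updates.

    Assumption: one flushed buffer corresponds to one network message.
    """

    def __init__(self, num_pes, buffer_capacity):
        self.buffer_capacity = buffer_capacity
        self.buffers = [[] for _ in range(num_pes)]
        self.messages_sent = 0
        # Stand-in for the remote histogram table owned by each PE.
        self.delivered = [defaultdict(int) for _ in range(num_pes)]

    def push(self, dest_pe, index):
        """Queue one increment; flush the buffer once it is full."""
        buf = self.buffers[dest_pe]
        buf.append(index)
        if len(buf) >= self.buffer_capacity:
            self._flush(dest_pe)

    def _flush(self, dest_pe):
        """Deliver all queued increments for one PE as a single message."""
        buf = self.buffers[dest_pe]
        if not buf:
            return
        self.messages_sent += 1
        for index in buf:
            self.delivered[dest_pe][index] += 1
        buf.clear()

    def finish(self):
        """Flush any partially filled buffers."""
        for pe in range(len(self.buffers)):
            self._flush(pe)


def run_demo(num_updates=1000, num_pes=4, buffer_capacity=64):
    """Return (unaggregated message count, aggregated message count)."""
    updates = [(i % num_pes, i % 10) for i in range(num_updates)]
    agg = Aggregator(num_pes, buffer_capacity)
    for dest_pe, index in updates:
        agg.push(dest_pe, index)
    agg.finish()
    # Without aggregation, every update would be its own message.
    return len(updates), agg.messages_sent


if __name__ == "__main__":
    plain, aggregated = run_demo()
    print(f"unaggregated messages: {plain}, aggregated messages: {aggregated}")
```

In this toy setting the message count depends only on the buffer capacity and how updates are distributed across destinations; real speedups depend on the network, the application, and the aggregation library, none of which are modelled here.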
Original language | English |
---|---|
Title of host publication | Euro-Par 2023 |
Subtitle of host publication | Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Proceedings |
Editors | José Cano, Marios D. Dikaiakos, George A. Papadopoulos, Miquel Pericàs, Rizos Sakellariou |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 32-46 |
Number of pages | 15 |
ISBN (Print) | 9783031396977 |
DOIs | |
State | Published - 2023 |
Event | 29th International European Conference on Parallel and Distributed Computing, Euro-Par 2023 - Limassol, Cyprus. Duration: Aug 28 2023 → Sep 1 2023 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 14100 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 29th International European Conference on Parallel and Distributed Computing, Euro-Par 2023 |
---|---|
Country/Territory | Cyprus |
City | Limassol |
Period | 08/28/23 → 09/01/23 |
Funding
Notice: This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Conveyors
- High Performance Computing
- Message Aggregation
- OpenSHMEM
- PGAS