The case for explicit reuse semantics for RDMA communication

Scott Levy, Patrick Widener, Craig Ulmer, Todd Kordenbrock

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Remote Direct Memory Access (RDMA) is an increasingly important technology in high-performance computing (HPC). RDMA provides low-latency, high-bandwidth data transfer between compute nodes. Additionally, it does not require explicit synchronization with the destination processor. Eliminating unnecessary synchronization can significantly improve the communication performance of large-scale scientific codes. A long-standing challenge presented by RDMA communication is mitigating the cost of registering memory with the network interface controller (NIC). Reusing memory once it is registered has been shown to significantly reduce the cost of RDMA communication. However, existing approaches for reusing memory rely on implicit memory semantics. In this paper, we introduce an approach that makes memory reuse semantics explicit by exposing a separate allocator for registered memory. The data and analysis in this paper yield the following contributions: (i) managing registered memory explicitly enables efficient reuse of registered memory; (ii) registering large memory regions to amortize the registration cost over multiple user requests can significantly reduce cost of acquiring new registered memory; and (iii) reducing the cost of acquiring registered memory can significantly improve the performance of RDMA communication. Reusing registered memory is key to high-performance RDMA communication. By making reuse semantics explicit, our approach has the potential to improve RDMA performance by making it significantly easier for programmers to efficiently reuse registered memory.

Original languageEnglish
Title of host publicationProceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages879-888
Number of pages10
ISBN (Electronic)9781728174457
DOIs
StatePublished - May 2020
Externally publishedYes
Event34th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020 - New Orleans, United States
Duration: May 18 2020May 22 2020

Publication series

NameProceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020

Conference

Conference34th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
Country/TerritoryUnited States
CityNew Orleans
Period05/18/2005/22/20

Funding

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy s National Nuclear Security Administration under contract DE-NA0003525. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.

FundersFunder number
U.S. Department of Energy s National Nuclear Security Administration
U.S. Department of Energy
National Nuclear Security AdministrationDE-NA0003525

    Keywords

    • HPC
    • Memory management
    • Messaging
    • RDMA

    Fingerprint

    Dive into the research topics of 'The case for explicit reuse semantics for RDMA communication'. Together they form a unique fingerprint.

    Cite this