Revisiting Credit Distribution Algorithms for Distributed Termination Detection

George Bosilca, Aurelien Bouteiller, Thomas Herault, Valentin Le Fevre, Yves Robert, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

This paper revisits distributed termination detection algorithms in the context of High-Performance Computing (HPC) applications. We introduce an efficient variant of the Credit Distribution Algorithm (CDA) and compare it to the original algorithm (HCDA) as well as to its two primary competitors: the Four Counters algorithm (4C) and the Efficient Delay-Optimal Distributed algorithm (EDOD). We analyze the behavior of each algorithm for some simplified task-based kernels and show the superiority of CDA in terms of the number of control messages.

Original language: English
Title of host publication: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 611-620
Number of pages: 10
ISBN (Electronic): 9781665435772
State: Published - Jun 2021
Externally published: Yes
Event: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - Virtual, Portland, United States
Duration: May 17 2021 → …

Publication series

Name: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021 - In conjunction with IEEE IPDPS 2021

Conference

Conference: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2021
Country/Territory: United States
City: Virtual, Portland
Period: 05/17/21 → …

Funding

Acknowledgements: This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research was supported partly by the NSF project #1450300.

Funder: National Science Foundation (funder number 1450300)
Funder: U.S. Department of Energy
Funder: National Nuclear Security Administration

    Keywords

    • Termination detection
    • Control messages
    • Credit distribution algorithms
    • Task-based HPC application
