Practical scalable consensus for pseudo-synchronous distributed systems

Thomas Herault, Aurelien Bouteiller, George Bosilca, Marc Gamell, Keita Teranishi, Manish Parashar, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Scopus citations

Abstract

The ability to consistently handle faults in a distributed environment requires, among a small set of basic routines, an agreement algorithm allowing surviving entities to reach a consensual decision between a bounded set of volatile resources. This paper presents an algorithm that implements an Early Returning Agreement (ERA) in pseudo-synchronous systems, which optimistically allows a process to resume its activity while guaranteeing strong progress. We prove the correctness of our ERA algorithm, and expose its logarithmic behavior, which is an extremely desirable property for any algorithm which targets future exascale platforms. We detail a practical implementation of this consensus algorithm in the context of an MPI library, and evaluate both its efficiency and scalability through a set of benchmarks and two fault tolerant scientific applications.

Original language: English
Title of host publication: Proceedings of SC 2015
Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
ISBN (Electronic): 9781450337236
DOIs
State: Published - Nov 15 2015
Event: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015 - Austin, United States
Duration: Nov 15 2015 → Nov 20 2015

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume: 15-20-November-2015
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015
Country/Territory: United States
City: Austin
Period: 11/15/15 → 11/20/15

Funding

The authors would like to thank Robert Clay, Michael Heroux and Josep Gamell for interesting discussions related to this work. This work is partially supported by the NSF (award #1339820), and the CREST project of the Japan Science and Technology Agency (JST). This work is also partially supported by the U.S. Department of Energy (DOE) National Nuclear Security Administration (NNSA) Advanced Simulation and Computing (ASC) program. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.

Funders (funder number, where given)
Sandia Corporation
U.S. Department of Energy's National Nuclear Security Administration: DE-AC04-94AL85000
National Science Foundation: 1339820
U.S. Department of Energy
National Sleep Foundation
Lockheed Martin Corporation
National Nuclear Security Administration: ASC
Sandia National Laboratories
Japan Science and Technology Agency
Core Research for Evolutional Science and Technology

    Keywords

    • MPI
    • agreement
    • fault-tolerance
