Self-stabilizing iterative solvers

Piyush Sao, Richard Vuduc

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

70 Scopus citations

Abstract

We show how to use the idea of self-stabilization, which originates in the context of distributed control, to make faulttolerant iterative solvers. Generally, a self-stabilizing system is one that, starting from an arbitrary state (valid or invalid), reaches a valid state within a finite number of steps. This property imbues the system with a natural means of tolerating transient faults. We give two proof-of-concept examples of self-stabilizing iterative linear solvers: one for steepest descent (SD) and one for conjugate gradients (CG). Ourself-stabilized versions of SD and CG require small amounts of fault-detection, e.g., we may check only for NaNs and infinities. We test our approach experimentally by analyzing its convergence and overhead for different types and ratesof faults. Beyond the specific findings of this paper, we believe self-stabilization has promise to become a useful tool for constructing resilient solvers more generally.

Original languageEnglish
Title of host publicationProc. of ScalA 2013
Subtitle of host publicationWorkshop on Latest Adv. in Scalable Algorithms for Large-Scale Systems - Held in Conjunction with SC 2013: The Int. Conf. for High Perform. Comput., Networking, Storage and Anal.
DOIs
StatePublished - 2013
Externally publishedYes
EventWorkshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2013 - Held in Conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013 - Denver, CO, United States
Duration: Nov 17 2013Nov 21 2013

Publication series

NameProc. of ScalA 2013: Workshop on Latest Adv. in Scalable Algorithms for Large-Scale Systems - Held in Conjunction with SC 2013: The Int. Conf. for High Perform. Comput., Networking, Storage and Anal.

Conference

ConferenceWorkshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2013 - Held in Conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013
Country/TerritoryUnited States
CityDenver, CO
Period11/17/1311/21/13

Keywords

  • Fault-tolerance
  • Iterative linear solvers
  • Self-stabilization
  • Transient soft faults

Fingerprint

Dive into the research topics of 'Self-stabilizing iterative solvers'. Together they form a unique fingerprint.

Cite this