TY - GEN
T1 - Self-stabilizing iterative solvers
AU - Sao, Piyush
AU - Vuduc, Richard
PY - 2013
Y1 - 2013
N2 - We show how to use the idea of self-stabilization, which originates in the context of distributed control, to make fault-tolerant iterative solvers. Generally, a self-stabilizing system is one that, starting from an arbitrary state (valid or invalid), reaches a valid state within a finite number of steps. This property imbues the system with a natural means of tolerating transient faults. We give two proof-of-concept examples of self-stabilizing iterative linear solvers: one for steepest descent (SD) and one for conjugate gradients (CG). Our self-stabilized versions of SD and CG require small amounts of fault-detection, e.g., we may check only for NaNs and infinities. We test our approach experimentally by analyzing its convergence and overhead for different types and rates of faults. Beyond the specific findings of this paper, we believe self-stabilization has promise to become a useful tool for constructing resilient solvers more generally.
AB - We show how to use the idea of self-stabilization, which originates in the context of distributed control, to make fault-tolerant iterative solvers. Generally, a self-stabilizing system is one that, starting from an arbitrary state (valid or invalid), reaches a valid state within a finite number of steps. This property imbues the system with a natural means of tolerating transient faults. We give two proof-of-concept examples of self-stabilizing iterative linear solvers: one for steepest descent (SD) and one for conjugate gradients (CG). Our self-stabilized versions of SD and CG require small amounts of fault-detection, e.g., we may check only for NaNs and infinities. We test our approach experimentally by analyzing its convergence and overhead for different types and rates of faults. Beyond the specific findings of this paper, we believe self-stabilization has promise to become a useful tool for constructing resilient solvers more generally.
KW - Fault-tolerance
KW - Iterative linear solvers
KW - Self-stabilization
KW - Transient soft faults
UR - http://www.scopus.com/inward/record.url?scp=84892910930&partnerID=8YFLogxK
U2 - 10.1145/2530268.2530272
DO - 10.1145/2530268.2530272
M3 - Conference contribution
AN - SCOPUS:84892910930
SN - 9781450325080
T3 - Proc. of ScalA 2013: Workshop on Latest Adv. in Scalable Algorithms for Large-Scale Systems - Held in Conjunction with SC 2013: The Int. Conf. for High Perform. Comput., Networking, Storage and Anal.
BT - Proc. of ScalA 2013
T2 - Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2013 - Held in Conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2013
Y2 - 17 November 2013 through 21 November 2013
ER -