Chaotic-identity maps for robustness estimation of exascale computations

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Exascale computing systems are expected to consist of millions of components, and current engineering and manufacturing practices cannot guarantee completely fault-free operation during code executions lasting several hours. Consequently, the outputs of computations executed on them must be accompanied by confidence estimates that reflect how likely the execution was to be failure-free. We propose (i) lightweight computational modules that utilize chaotic computations and customized identity maps to detect component failures, and (ii) statistical estimation methods that generate robustness estimates for the system and computations based on the module outputs. The diagnosis modules execute multiple Poincaré and identity maps, which are customized to detect certain classes of failures in the compute nodes and interconnects. The statistical methods generate robustness estimates for the system using the outputs of pipelined chains of diagnosis modules. These diagnosis modules can be inserted into application codes to identify failures and to generate confidence estimates for the application outputs. We present proof-of-principle simulation examples to illustrate the proposed approach.
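
The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name and parameter in it is hypothetical. It assumes the logistic map as the chaotic computation (the abstract does not name a specific map), an XOR round trip as a stand-in identity map, and a textbook zero-failure confidence bound in place of the paper's statistical estimators, which the abstract does not describe.

```python
def logistic_trajectory(x0, steps, r=4.0, fault_at=None, fault_size=1e-12):
    """Iterate x -> r*x*(1-x) and return every iterate.

    If fault_at is set, a tiny error is injected at that step to mimic a
    transient arithmetic fault (an assumption made for illustration only).
    """
    x = x0
    trajectory = []
    for i in range(steps):
        x = r * x * (1.0 - x)
        if fault_at is not None and i == fault_at:
            x += fault_size  # hypothetical injected fault
        trajectory.append(x)
    return trajectory


def chaotic_check(observed, reference, tol=1e-6):
    """Flag a failure if the observed trajectory ever leaves the reference.

    Chaos amplifies a small error exponentially, so a fault far below the
    tolerance when it occurs becomes visible a few dozen iterations later.
    """
    return any(abs(a - b) > tol for a, b in zip(observed, reference))


def identity_check(payload, mask=0x5A):
    """Apply XOR with a mask twice; on a healthy path the result is the input.

    Returns True if a failure is detected (the round trip altered the data).
    """
    encoded = bytes(b ^ mask for b in payload)
    decoded = bytes(b ^ mask for b in encoded)
    return decoded != payload


def failure_rate_upper_bound(n_passes, delta=0.05):
    """Upper bound on the per-run failure probability after n clean runs.

    If each run fails independently with probability p, then n clean runs
    occur with probability (1 - p)**n. Requiring that probability to be at
    least delta gives p <= 1 - delta**(1/n) at confidence level 1 - delta.
    """
    return 1.0 - delta ** (1.0 / n_passes)


if __name__ == "__main__":
    reference = logistic_trajectory(0.3, 100)
    clean = logistic_trajectory(0.3, 100)
    faulty = logistic_trajectory(0.3, 100, fault_at=10)

    print("clean run flagged: ", chaotic_check(clean, reference))
    print("faulty run flagged:", chaotic_check(faulty, reference))
    print("identity map flagged:", identity_check(b"diagnostic payload"))
    print("bound after 100 clean runs:", failure_rate_upper_bound(100))
```

In this sketch the chaotic check targets arithmetic errors, because the map magnifies them, while the identity check targets corruption of data in transit or storage. The bound shows one simple way that the pass/fail outputs of repeated diagnosis runs could be turned into a confidence statement.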

Original language: English
Title of host publication: 2012 IEEE/IFIP 42nd International Conference on Dependable Systems and Networks Workshops, DSN-W 2012
State: Published - 2012
Event: 2012 IEEE/IFIP 42nd International Conference on Dependable Systems and Networks Workshops, DSN-W 2012 - Boston, MA, United States
Duration: Jun 25 2012 → Jun 28 2012

Publication series

Name: Proceedings of the International Conference on Dependable Systems and Networks

Conference

Conference: 2012 IEEE/IFIP 42nd International Conference on Dependable Systems and Networks Workshops, DSN-W 2012
Country/Territory: United States
City: Boston, MA
Period: 06/25/12 → 06/28/12
