Language support for reliable memory regions

Saurabh Hukerikar, Christian Engelmann

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The path to exascale computational capabilities in high-performance computing (HPC) systems is challenged by the inadequacy of present software technologies to adapt to the rapid evolution of architectures of supercomputing systems. The constraints of power have driven system designs to include increasingly heterogeneous architectures and diverse memory technologies and interfaces. Future systems are also expected to experience an increased rate of errors, such that the applications will no longer be able to assume correct behavior of the underlying machine. To enable the scientific community to succeed in scaling their applications, and to harness the capabilities of exascale systems, we need software strategies that enable explicit management of resilience to errors in the system, in addition to locality of reference in the complex memory hierarchies of future HPC systems. In prior work, we introduced the concept of explicitly reliable memory regions, called havens. Memory management using havens supports reliability management through a region-based approach to memory allocations. Havens enable the creation of robust memory regions, whose resilient behavior is guaranteed by software-based protection schemes. In this paper, we propose language support for havens through type annotations that make the structure of a program’s havens more explicit and convenient for HPC programmers to use. We describe how the extended haven-based memory management model is implemented, and demonstrate the use of the language-based annotations to affect the resiliency of a conjugate gradient solver application.

Original language: English
Title of host publication: Languages and Compilers for Parallel Computing - 29th International Workshop, LCPC 2016, Revised Papers
Editors: Chen Ding, John Criswell, Peng Wu
Publisher: Springer Verlag
Pages: 73-87
Number of pages: 15
ISBN (Print): 9783319527086
State: Published - 2017
Event: 29th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2016 - Rochester, United States
Duration: Sep 28 2016 → Sep 30 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10136 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2016
Country/Territory: United States
City: Rochester
Period: 09/28/16 → 09/30/16

Bibliographical note

Publisher Copyright:
© Springer International Publishing AG 2017.
