Symmetric active/active high availability for high-performance computing system services: Accomplishments and limitations

C. Engelmann, S. L. Scott, C. Leangsuksun, X. He

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

10 Scopus citations

Abstract

This paper summarizes our efforts over the last 3-4 years in providing symmetric active/active high availability for high-performance computing (HPC) system services. This work paves the way for high-level reliability, availability and serviceability in extreme-scale HPC systems by focusing on the most critical components, head and service nodes, and by reinforcing them with appropriate high availability solutions. This paper presents our accomplishments in the form of concepts and respective prototypes, discusses existing limitations, outlines possible future work, and describes the relevance of this research to other, planned efforts.

Original language: English
Title of host publication: Proceedings CCGRID 2008 - 8th IEEE International Symposium on Cluster Computing and the Grid
Pages: 813-818
Number of pages: 6
State: Published - 2008
Event: CCGRID 2008 - 8th IEEE International Symposium on Cluster Computing and the Grid - Lyon, France
Duration: May 19 2008 to May 22 2008
