Abstract
This talk will describe an implementation of MPI that extends the message-passing model to allow recovery in the presence of a faulty process. Our implementation allows a user to catch the fault and then recover from it. We will also touch on the issues involved in using diskless checkpointing to allow effective recovery of an application in the presence of a process fault.
Original language | English |
---|---|
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Editors | Dieter Kranzlmüller, Peter Kacsuk, Jack Dongarra |
Publisher | Springer Verlag |
Pages | 6 |
Number of pages | 1 |
ISBN (Print) | 3540231633 |
State | Published - 2004 |
Event | 11th European Conference on Parallel Virtual Machine and Message Passing Interface Users Group Meeting, PVM/MPI 2004 - Budapest, Hungary; Duration: Sep 19 2004 → Sep 22 2004 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 3241 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 11th European Conference on Parallel Virtual Machine and Message Passing Interface Users Group Meeting, PVM/MPI 2004 |
---|---|
Country/Territory | Hungary |
City | Budapest |
Period | 09/19/04 → 09/22/04 |
Bibliographical note
Publisher Copyright: © Springer-Verlag Berlin Heidelberg 2004.