TY - GEN
T1 - A framework for proactive fault tolerance
AU - Vallée, Geoffroy
AU - Charoenpornwattana, Kulathep
AU - Engelmann, Christian
AU - Tikotekar, Anand
AU - Leangsuksun, Chokchai
AU - Naughton, Thomas
AU - Scott, Stephen L.
PY - 2008
Y1 - 2008
N2 - Fault tolerance is a major concern to guarantee availability of critical services as well as application execution. Traditional approaches for fault tolerance include checkpoint/restart or duplication. However, it is also possible to anticipate failures and proactively take action before failures occur in order to minimize failure impact on the system and application execution. This document presents a proactive fault tolerance framework. This framework can use different proactive fault tolerance mechanisms, i.e., migration and pause/unpause. The framework also allows the implementation of new proactive fault tolerance policies thanks to a modular architecture. A first proactive fault tolerance policy has been implemented, and preliminary experiments have been conducted based on system-level virtualization and compared with results obtained by simulation.
AB - Fault tolerance is a major concern to guarantee availability of critical services as well as application execution. Traditional approaches for fault tolerance include checkpoint/restart or duplication. However, it is also possible to anticipate failures and proactively take action before failures occur in order to minimize failure impact on the system and application execution. This document presents a proactive fault tolerance framework. This framework can use different proactive fault tolerance mechanisms, i.e., migration and pause/unpause. The framework also allows the implementation of new proactive fault tolerance policies thanks to a modular architecture. A first proactive fault tolerance policy has been implemented, and preliminary experiments have been conducted based on system-level virtualization and compared with results obtained by simulation.
UR - http://www.scopus.com/inward/record.url?scp=49049111154&partnerID=8YFLogxK
U2 - 10.1109/ARES.2008.171
DO - 10.1109/ARES.2008.171
M3 - Conference contribution
AN - SCOPUS:49049111154
SN - 0769531024
SN - 9780769531021
T3 - ARES 2008 - 3rd International Conference on Availability, Security, and Reliability, Proceedings
SP - 659
EP - 664
BT - ARES 2008 - 3rd International Conference on Availability, Security, and Reliability, Proceedings
T2 - 3rd International Conference on Availability, Security, and Reliability, ARES 2008
Y2 - 4 March 2008 through 7 March 2008
ER -