Project Details
Description
This project will create a "Fault Tolerance Backplane" (FTB) and build the infrastructure necessary to enable systems to adapt to faults in a holistic manner. The approach will be to design a reference implementation of a fault awareness and notification backplane that provides common, uniform event-handling and notification mechanisms for fault-aware libraries and middleware; to create an interface specification that allows libraries, run-time systems, and applications to connect to and use the FTB; and to extend key libraries and applications to validate the interface choices and to form the critical mass necessary for adoption in the community. The FTB will be designed and built to provide lightweight coordination and rudimentary prediction capabilities. The FTB will allow applications to survive many types of errors. The project will initially work with chemistry and fusion applications and then extend the adaptive fault capabilities to other Scientific Discovery through Advanced Computing (SciDAC) applications.
| Status | Finished |
|---|---|
| Effective start/end date | 09/30/06 → 09/30/11 |
Funding
- U.S. Department of Energy