Abstract
Scalable and fault-tolerant runtime environments are needed to support and adapt to underlying libraries and hardware that require a high degree of scalability in dynamic, large-scale environments. This paper presents a self-healing network (SHN) for supporting such runtime environments. The SHN is designed to support transmission of messages across multiple nodes while also protecting against recursive node and process failures, and it recovers automatically after a failure occurs. SHN is implemented on top of a scalable fault-tolerant protocol (SFTP). The experimental results show that the latest multicast and broadcast routing algorithms used in SHN are both faster than the original SFTP routing algorithms.
| Original language | English |
|---|---|
| Title of host publication | Distributed and Parallel Systems |
| Subtitle of host publication | From Cluster to Grid Computing |
| Publisher | Springer US |
| Pages | 73-80 |
| Number of pages | 8 |
| ISBN (Print) | 0387698574, 9780387698571 |
| State | Published - 2007 |
| Externally published | Yes |
Keywords
- Fault tolerance
- Routing
- Runtime Environment
- Scalability
- Self-healing