TY - GEN
T1 - Fault resilient real-time design for NoC architectures
AU - Zimmer, Christopher
AU - Mueller, Frank
PY - 2012
Y1 - 2012
N2 - Performance and time to market requirements cause many real-time designers to consider components, off the shelf (COTS) for real-time cyber-physical systems. Massive multi-core embedded processors with network-on-chip (NoC) designs to facilitate core-to-core communication are becoming common in COTS. These architectures benefit real-time scheduling, but they also pose predictability challenges. In this work, we develop a framework for Fault Observant and Correcting Real-Time Embedded design (Forte) that utilizes massive multi-core NoC designs to reduce overhead by up to an order of magnitude and to lower jitter in systems via utilizing message passing instead of shared memory as the means for intra-processor communication. Message passing, which is shown to improve the overall scalability of the system, is utilized as the basis for replication and task rejuvenation. This improves fault resilience by orders of magnitude. To our knowledge, this work is the first to systematically map real-time tasks onto massive multi-core processors with support for fault tolerance that considers NoC effects on scalability on an real hardware platform and not just in simulation.
AB - Performance and time to market requirements cause many real-time designers to consider components, off the shelf (COTS) for real-time cyber-physical systems. Massive multi-core embedded processors with network-on-chip (NoC) designs to facilitate core-to-core communication are becoming common in COTS. These architectures benefit real-time scheduling, but they also pose predictability challenges. In this work, we develop a framework for Fault Observant and Correcting Real-Time Embedded design (Forte) that utilizes massive multi-core NoC designs to reduce overhead by up to an order of magnitude and to lower jitter in systems via utilizing message passing instead of shared memory as the means for intra-processor communication. Message passing, which is shown to improve the overall scalability of the system, is utilized as the basis for replication and task rejuvenation. This improves fault resilience by orders of magnitude. To our knowledge, this work is the first to systematically map real-time tasks onto massive multi-core processors with support for fault tolerance that considers NoC effects on scalability on an real hardware platform and not just in simulation.
UR - http://www.scopus.com/inward/record.url?scp=84861487943&partnerID=8YFLogxK
U2 - 10.1109/ICCPS.2012.16
DO - 10.1109/ICCPS.2012.16
M3 - Conference contribution
AN - SCOPUS:84861487943
SN - 9780769546957
T3 - Proceedings - 2012 IEEE/ACM 3rd International Conference on Cyber-Physical Systems, ICCPS 2012
SP - 75
EP - 84
BT - Proceedings - 2012 IEEE/ACM 3rd International Conference on Cyber-Physical Systems, ICCPS 2012
T2 - 2012 IEEE/ACM 3rd International Conference on Cyber-Physical Systems, ICCPS 2012
Y2 - 17 April 2012 through 19 April 2012
ER -