Passive Fault-Tolerance Management in Component-based Embedded Systems
Ref: CISTER-TR-150505 Publication Date: 2015
It is imperative to accept that failures can and will occur even in meticulously designed distributed systems, and to design proper measures to counter those failures. Passive replication minimizes resource consumption by activating redundant replicas only in case of failures, since providing and applying state updates is typically less resource-demanding than requesting execution. However, most existing solutions for passive fault tolerance are designed and configured at design time, explicitly and statically identifying the most critical components and their number of replicas, and therefore lack the flexibility needed to handle the runtime dynamics of distributed component-based embedded systems. This paper proposes a cost-effective adaptive fault tolerance solution with a significantly lower overhead than a strict active redundancy-based approach, achieving high error coverage with a minimum amount of redundancy. The activation of passive replicas is coordinated through a feedback-based coordination model that reduces the complexity of the needed interactions among components until a new collective global service solution is determined, hence improving the overall maintainability and robustness of the system.
Published in Computing and Informatics (JCAI), Slovak Academy of Sciences, Volume 35, Issue 1, pp 23-44.
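The passive replication idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class names (`Replica`, `PassiveGroup`), the method names, and the rule of promoting the first backup are hypothetical, and the feedback-based coordination model is not modelled at all. The sketch only shows that passive replicas receive state updates without executing requests, and that one of them is activated when the primary fails.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical component replica; only an active replica executes requests."""
    name: str
    active: bool = False
    state: dict = field(default_factory=dict)
    executions: int = 0

    def execute(self, key, value):
        # Executing a request is the expensive path, reserved for the active replica.
        if not self.active:
            raise RuntimeError(f"{self.name} is passive and cannot execute")
        self.state[key] = value
        self.executions += 1
        return dict(self.state)

    def apply_update(self, snapshot):
        # Cheap path: a passive replica only copies the primary's state.
        self.state = dict(snapshot)


class PassiveGroup:
    """Hypothetical group of one primary and several passive backups."""

    def __init__(self, primary, backups):
        primary.active = True
        self.primary = primary
        self.backups = list(backups)

    def request(self, key, value):
        # The primary executes, then pushes a state update to every backup.
        snapshot = self.primary.execute(key, value)
        for backup in self.backups:
            backup.apply_update(snapshot)

    def fail_primary(self):
        # Assumption: the first backup is promoted; the paper instead coordinates
        # activation through its feedback-based model.
        if not self.backups:
            raise RuntimeError("no passive replica left to activate")
        self.primary.active = False
        self.primary = self.backups.pop(0)
        self.primary.active = True
        return self.primary
```

For example, after `group = PassiveGroup(Replica("p"), [Replica("b1"), Replica("b2")])` and `group.request("speed", 10)`, both backups hold the updated state without having executed anything, and `group.fail_primary()` promotes `b1`, which can then serve further requests from that state.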