Computational Effort Reduction of Fault Tolerance in Scalable Agent Support Systems
Date
2008-09
Publisher
Addis Ababa University
Abstract
Distributed systems are characterized by partial failures. Such failures can prevent an application from performing a system-wide computation and from reaching consensus among its components (for example, agents). Partial failures have several causes, one of which is Byzantine failure.
Many studies have addressed Byzantine failure by proposing algorithms that tolerate it. Most of these algorithms focus on how to reach agreement and tolerate faults, whereas the Query/Update (Q/U) protocol addresses fault tolerance with respect to scalability.
However, these algorithms do not take into account the computational effort they require, which makes them unattractive for practical use. Nor do they thoroughly address fault scalability under Byzantine faults. Hence, in this research we looked for a better technique to address the fault scalability of Byzantine failures.
Two approaches have been used to address Byzantine failure: the agreement protocol and the quorum protocol. The quorum protocol has been shown to be fault-scalable and efficient. The protocol presented here combines the agreement-based and quorum-based approaches in its agreement sub-protocol, which focuses on how to arrive at a common value within a quorum and between quorums. The election sub-protocol elects a primary from among the existing replicas.
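The difference between the two approaches can be made concrete with the replica and quorum sizes commonly cited for each. The following is a minimal sketch, not the protocol of this thesis: it assumes the textbook bounds of 3f + 1 replicas with quorums of 2f + 1 for PBFT-style agreement, and 5f + 1 replicas with quorums of 4f + 1 for Q/U. The names `ReplicaBounds`, `agreement_bounds`, `qu_bounds`, and `quorum_overlap` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaBounds:
    """Replica count and quorum size needed to tolerate f Byzantine faults."""
    replicas: int
    quorum: int


def agreement_bounds(f: int) -> ReplicaBounds:
    """Assumed PBFT-style agreement bounds: n = 3f + 1, quorum = 2f + 1."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return ReplicaBounds(replicas=3 * f + 1, quorum=2 * f + 1)


def qu_bounds(f: int) -> ReplicaBounds:
    """Assumed Q/U-style quorum bounds: n = 5f + 1, quorum = 4f + 1."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return ReplicaBounds(replicas=5 * f + 1, quorum=4 * f + 1)


def quorum_overlap(bounds: ReplicaBounds) -> int:
    """Minimum number of replicas shared by any two quorums (2q - n)."""
    return 2 * bounds.quorum - bounds.replicas


if __name__ == "__main__":
    # Print how the requirements grow as more faults are tolerated.
    for f in range(1, 4):
        a, q = agreement_bounds(f), qu_bounds(f)
        print(f"f={f}: agreement n={a.replicas}, quorum={a.quorum}, "
              f"overlap={quorum_overlap(a)}; "
              f"Q/U n={q.replicas}, quorum={q.quorum}, "
              f"overlap={quorum_overlap(q)}")
```

Under these assumed bounds, any two agreement quorums share at least f + 1 replicas, so at least one shared replica is correct, while any two Q/U quorums share at least 3f + 1. The larger overlap is what allows the quorum approach to dispense with inter-replica agreement rounds, at the cost of more replicas per tolerated fault.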
The integration of AgentScape/DARX was used in the experiments to test fault tolerance in scalable agent support systems. The scalability of the fault-tolerant support system was assessed through experimental performance evaluation.
The protocol executes requests in three one-way message delays, which is an acceptable latency for agreement on a client request. The protocol is robust to increases in the number of tolerated Byzantine faults and continues to provide significantly higher throughput, thereby achieving the scalability required of a fault-tolerant multi-agent support system.
Keywords: Multi-agent support system, Scalable fault tolerance, Byzantine faults