Atomic Broadcast in Asynchronous Crash-Recovery Distributed
Systems and its use in Quorum-Based Replication .
L. Rodrigues and M. Raynal
This report will be published in IEEE Transactions on Knowledge and
Data Engineering, Vol. 15, No 4, July/August 2003 (to appear).
Abstract
Atomic Broadcast is a fundamental problem of distributed systems:
it states that messages must be delivered in the same order to
their destination processes. This paper describes a solution to this
problem in asynchronous distributed systems in which processes can
crash and recover.
A Consensus-based solution to Atomic Broadcast problem has
been designed by Chandra and Toueg for asynchronous distributed
systems where crashed processes do not recover. Our extends this
approach: it transforms any Consensus protocol suited to the
crash-recovery model into an Atomic Broadcast protocol suited to
the same model. We show that Atomic Broadcast can be implemented
requiring few additional log operations in excess of those
required by the Consensus. The paper also discusses how additional
log operations can improve the protocol in terms of faster
recovery and better throughput.
To illustrate the use of the protocol, the paper also describes a
solution to the replica management problem in asynchronous
distributed systems in which processes can crash and recover. The
proposed technique makes a bridge between established results on
Weighted Voting and recent results on
the Consensus problem.
Not available online
Luís Rodrigues