Scalable Atomic Multicast

Luís Rodrigues and Rachid Guerraoui and André Schiper

Selected sections of this report will be published in the Proceedings of the Seventh International Conference on Computer Communications and Networks (IC3N'98), Lafayette, Louisiana, USA, 12-15 October, 1998.

Abstract

We present a new scalable fault-tolerant algorithm which ensures total order delivery of messages sent to multiple groups of processes. Our algorithm is particularly well suited for large scale systems because: (1) any process can multicast a message to one or more groups of processes without being forced to join those groups; (2) inter-group total order is ensured system-wide but, for each individual multicast, the number and size of messages exchanged depends only on the number of addressees; (3) process failure detection does not need to be reliable.

Our algorithm also exhibits a modular design. It uses two companion protocols, namely a reliable multicast protocol and a consensus protocol, and these protocols are not required to use the same communication channels or to share common variables with the total order protocol. This approach follows a design methodology based on the composition of (encapsulated) micro-protocols.

Also available extended report (gzip postscript), (pdf) .

Luís Rodrigues