Large-Scale Peer-to-Peer Autonomic Monitoring
Selected sections of this report were accepted for publication in the Proceedings of the Distributed Autonomous Network Management Systems Workshop (DANMS'08), New Orleans, USA, October 2008.
The increasing scale and complexity of distributed systems motivate the need for autonomous management. A key aspect of managing distributed systems is component monitoring. Monitoring is particularly challenging in large-scale dynamic systems, because each component must be monitored by at least one non-faulty component despite joins, leaves, and failures at both the node and the network level. This paper proposes that components self-organize into an unstructured overlay network of constant degree, so that each component is always monitored by at least a threshold number of other components.
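To make the monitoring invariant concrete, the following is a minimal sketch and not the protocol proposed in this paper. It assumes a static overlay built from a randomly shuffled ring, in which every node monitors the next `degree` nodes, and it checks which live nodes fall below a monitoring threshold after failures. The function names (`build_overlay`, `monitors_of`, `under_monitored`) and the construction itself are hypothetical, chosen only for illustration.

```python
import random


def build_overlay(nodes, degree, seed=None):
    """Return {node: set of nodes it monitors}.

    Assumption: a shuffled ring where each node monitors its next
    `degree` successors, so out-degree and in-degree both equal `degree`.
    """
    order = list(nodes)
    if not 0 < degree < len(order):
        raise ValueError("degree must be between 1 and len(nodes) - 1")
    random.Random(seed).shuffle(order)
    n = len(order)
    return {
        order[i]: {order[(i + j) % n] for j in range(1, degree + 1)}
        for i in range(n)
    }


def monitors_of(overlay, target, failed=frozenset()):
    """Return the non-failed nodes that currently monitor `target`."""
    return {
        monitor
        for monitor, targets in overlay.items()
        if target in targets and monitor not in failed
    }


def under_monitored(overlay, threshold, failed=frozenset()):
    """Return live nodes monitored by fewer than `threshold` live nodes."""
    return [
        node
        for node in overlay
        if node not in failed
        and len(monitors_of(overlay, node, failed)) < threshold
    ]


if __name__ == "__main__":
    overlay = build_overlay(range(20), degree=3, seed=1)
    victim = next(iter(overlay))
    # Nodes that lost a monitor and would need the overlay to repair itself.
    print(under_monitored(overlay, threshold=3, failed={victim}))
```

In this sketch a single failure leaves exactly `degree` live nodes with one monitor fewer, which is the situation a self-organizing overlay would have to repair by re-establishing links.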