Larchant: garbage collection in a cached distributed shared store with persistence by reachability
The model of Larchant is that of a Shared Address Space
(spanning every site in a network including secondary storage) with Persistence
By Reachability. To provide the illusion of a shared address space
across the network, despite the fact that site memories are disjoint, Larchant
implements a distributed shared memory mechanism. Reachability is
accessed by tracing the pointer graph, starting from the persistent root,
and reclaiming unreachable objects. This is the task of Garbage Collection
(GC). GC was until recently thought to be intractable in a large-scale
system, due to problems of scale, incoherence, asynchrony, and performance.
This thesis presents the solutions that Larchant proposes to these problems.
The GC algorithm in Larchant combines tracing and reference-listing. It
traces whenever economically feasible, i.e., as long as the memory subset
being collected remains local to a site, and counts references that would
cost I/O traffic to trace. GC is orthogonal to coherence, i.e., makes progress
even if only incoherent replicas are locally available. The garbage collector
runs concurrently and asynchronously to applications. The reference-listing
boundary changes dynamically and seamlessly, and independently at each
site, in order to collect cycles of unreachable objects. We prove formally
that our GC algorithm is correct, i.e., it is safe and live. The performance
results from our Larchant prototype show that our design goals (scalability,
coherence orthogonality, and good performance) are fulfilled.
Keywords: distributed systems, object-oriented systems, persistence
by reachability, distributed shared memory, distributed garbage collection.