Garbage Collection in Transactional Distributed Persistent Stores
An essential part of tomorrow's computing will be workers and organizations carrying out cooperative tasks interacting via shared information. The need for fault-tolerant sharing of data is well known in applications such as interactive CAD (for financial or engineering purposes) or cooperative tools, for example. Information is shared either concurrently or at different times, thus it must be available at different locations, persist beyond completion of a particular application, and remain consistent even in presence of faults.
The overall goal of a TPDS (Transactional Distributed Persistent Store) is to provide a platform that makes such data sharing simple, efficient, and fault-tolerant. For this purpose, a TPDS supports automatic and transparent persistence, memory-management, input-output, and distribution. With such a TPDS, programmers may concentrate on application development without being distracted by system issues such as memory bugs, caching, coherence, remote access, input-output, etc.
We can see the TPDS underlying model as a Shared Address Space with Persistence By Reachability (PBR). The address space spans every participating site in a network. This model is very attractive given its simplicity for application programmers. As a matter of fact, a shared address space enables programs to share data by using ordinary pointers. With this model, distributed programming is very simple, since it provides the same memory abstraction as in the centralized case. To provide the illusion of a shared address space across the network, although site memories are disjoint, a TPDS implements a distributed shared memory (DSM) mechanism. Data is replicated in multiple sites for performance and availability.
Accessing data (enclosed in a transaction) entails finding its reference by navigation from a well-known persistent root (e.g., a name server). In systems with PBR a datum reachable from the persistent root is persistent, thus it should persist on permanent storage. A datum transitively reachable from the persistent root is persistent also. Unreachable data is not needed and can be reclaimed (and memory compacted). Such data is said to be garbage. Reachability is accessed by tracing the pointer graph, starting from the persistent root, and reclaiming unreachable data. This can be done either via manual memory-management, or automatically via garbage collection. Manual memory-management is extremely error-prone as dangling pointers and memory leaks occur frequently. This results in the violation of the fundamental requirement of Referential Integrity: following a pointer should always work, i.e., if persistent datum x points to datum y, then y must be persistent also.
In a short lived application, the causes of a dangling pointer can be debugged, in spite of being difficult and tedious. The same applies to memory leaks; however, in some cases such leaks may not be a problem given that the application runs only for a short period of time. Thus, if in a centralized short-lived program garbage collection (GC) might be considered as an expensive luxury, in contrast, in a TPDS the nightmare of manual memory-management increases as the amount of data, pointers and users scales up. This is clarified by the following example.
Imagine a situation in which a datum y is erroneously deleted by the programmer. (Note that this kind of situations are extremely frequent even in simple applications.) From this moment on, a datum x that pointed to y, holds a dangling pointer. Then, the running application finishes and stores this data on disk. So far, the dangling pointer has not been used, thus there is no way to detect its existence. Some time later (possibly one or two weeks) another application runs; it accesses datum x, and legitimately tries to access y. At this moment the consequences of the dangling pointer show up by generating an error that prevents the application to continue. It is clear that debugging such a situation is completely unfeasible. Then GC becomes indispensable.
GC automatically ensures referential integrity, therefore improving program reliability and programming productivity. In addition, it may compact memory thus reducing fragmentation and improving locality.
The goal of this project can be summarized as follows: to improve programmer productivity, program reliability, and performance, by providing, inside a TPDS, a sub-system supporting fault-tolerant garbage collection. The main problems we tackle are now summarized:
- Consistency interference. The GC algorithm must not compete with transactions running on behalf of applications for holding consistent data. Such competition would interfere with applications functioning. For example, if the garbage collector requires to access exclusively a consistent datum, that would prevent an application to access a replica of that same datum, possibly locally cached, at the same time.
- Integration with fault-tolerance mechanisms. The GC algorithm must be well integrated, i.e., it must remain safe, live, and complete in presence of transactions, checkpointing, and versioning.
The GC itself must be fault-tolerant, i.e., it must be safe and live even in the presence of site failures or unreliable communication.
- Scalability. The GC algorithm must be capable of collecting a huge pointer graph without incurring into high performance costs due to computation, input-output, and synchronization. In particular, the collection of the whole graph in a single phase (e.g., running the collector as a single transaction accessing all persistent data) is clearly not scalable.
- Performance. Applications performance overhead due to GC must be minimized. In particular, interactive applications should not be disrupted by the collector in such a way that its user would be annoyed.
Check out the Overview Slides in PDF.
Sponsoring bodies: FCT
Coordinator: Paulo Ferreira