Leveled Garbage Collection

Leveled Garbage Collection by Guanshan Tong and Michael J. O'Donnell:

Generational garbage collection (GGC) is one of the most popular garbage collection techniques. GGC gains a performance advantage by performing minor collections on the younger objects in the heap, reducing the number of major collections of the whole heap. A promotion policy determines when an object moves from the younger generation to the older. The design of GGC has been justified by the plausible assumption that many objects die very young, and a few live a very long time. But, attempts to tune the performance of GGC by adjusting the promotion policy have been disappointing - only the simplest immediate promotion policy has proved attractive. The success of GGC is probably due to simplicity and to avoiding scans of the whole heap, rather than to accurate lifetime predictions.
This paper presents Leveled Garbage Collection (LGC), a new algorithm that is not based on object ages. It uses a heap structure and collection scheme similar to those of generational garbage collectors, and has a non-age-based promotion policy that doesn't promote all of the live objects, but still guarantees ample free space immediately after each garbage collection. By tuning LGC's promotion policy, we can often improve on GGC with immediate promotion.
Performance comparisons show that LGC outperforms GGC with immediate promotion policy in many cases, while losing only slightly on cases favorable to immediate promotion. LGC has a substantial advantage when the heap fits in main memory, and an even greater advantage as the heap gets paged to disk.

Leveled GC is based on a more general heuristic than generational GC, in that it tries to keep as many objects as possible in the nursery, because minor collections are so much cheaper. What I found most interesting about this paper is that it scales well with virtual memory, where paging, as we know, can degrade performance significantly. They provide benchmarks demonstrating a marked difference when large heap sizes trigger paging (Section 5.2.2): LGC's performance is hardly affected, while the runtime of generational GC degrades significantly.
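To make the idea concrete, here is a minimal sketch of a minor collection with a non-age-based promotion policy in this spirit: survivors stay in the nursery until keeping them all would violate a free-space guarantee, and only the overflow is promoted. This is an illustration of the general heuristic, not the paper's actual LGC algorithm; all names and parameters are invented.

```python
def minor_collect(nursery_live, nursery_size, min_free):
    """Decide which surviving nursery objects to keep and which to promote.
    `nursery_live` is a list of (obj_id, size) pairs found live during the
    trace; after collection, at least `min_free` bytes of nursery must be
    free, so survivors may occupy at most nursery_size - min_free bytes."""
    kept, promoted, used = [], [], 0
    budget = nursery_size - min_free        # max bytes survivors may occupy
    for obj, size in nursery_live:
        if used + size <= budget:           # still fits: keep in the nursery
            kept.append(obj)
            used += size
        else:                               # would break the free-space bound
            promoted.append(obj)            # promote to the older level
    return kept, promoted

kept, promoted = minor_collect(
    [("a", 40), ("b", 30), ("c", 50)], nursery_size=100, min_free=40)
# With a 100-byte nursery and 40 bytes reserved, survivors may occupy at most
# 60 bytes: "a" (40 bytes) stays, "b" and "c" are promoted.
```

Note that, unlike age-based promotion, which objects get promoted here depends only on trace order and the free-space guarantee, not on how many collections an object has survived.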


Feedback Effects

A problem to be aware of in measuring GC performance in real-life scenarios: the expected GC performance profile may be taken into account by the architecture of the system being built.

Generational GC works brilliantly when all long-lived objects are allocated once, early on, and all short-lived objects are discarded frequently, almost to the point of copying or reallocating in preference to reusing an object. In designing systems in the past running on generational GC, I've explicitly taken this into account in the application architecture. The result is a system which has grown in symbiosis with generational GC, and is not likely to benefit from a significantly different approach.

I conjecture that objective assessment of GC algorithms in real-world scenarios is problematic for this reason.

how to improve it (?)


I speculate that it can be improved as follows:

LGC starts to promote after finding too many live objects. There is a better way, perhaps.

Part 1:

Instead, examine the rate at which live fast-generation objects are being found, relative to the rate at which new pointers needing tracing are identified. When that rate is too low to ensure sufficient free space at the end, start promoting. When the rate rises high enough again, stop promoting.
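One reading of this heuristic: compare the live objects found so far against the pointers still pending on the trace frontier. A low ratio means a large unexplored frontier, predicting more survivors than the free-space guarantee allows, so promotion switches on; when the ratio recovers, it switches off. A toy sketch under that interpretation (thresholds and names are made up):

```python
class RatePromoter:
    """Hysteresis toggle for promotion during a minor collection, driven by
    the ratio of live objects already found to the total of found objects
    plus pointers still awaiting tracing."""

    def __init__(self, on_below, off_above):
        self.on_below = on_below      # switch promotion on below this ratio
        self.off_above = off_above    # switch it back off above this ratio
        self.promoting = False

    def update(self, live_found, frontier_pending):
        ratio = live_found / max(1, live_found + frontier_pending)
        if not self.promoting and ratio < self.on_below:
            self.promoting = True     # large frontier: expect many survivors
        elif self.promoting and ratio > self.off_above:
            self.promoting = False    # frontier shrinking: stop promoting
        return self.promoting

promoter = RatePromoter(on_below=0.3, off_above=0.6)
early = promoter.update(live_found=1, frontier_pending=9)  # ratio 0.1: promote
later = promoter.update(live_found=8, frontier_pending=2)  # ratio 0.8: stop
```

The two thresholds give hysteresis, so promotion does not flap on and off as the ratio hovers near a single cutoff.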

Part 2:

Perhaps track the number of objects promoted from a given page and, if it crosses a threshold tied to the ratio of newspace to free space needed for the collection overall, promote all live objects found on that page during this minor collection.
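A sketch of that per-page rule: once the fraction of a page's objects already promoted crosses a threshold, every further live object found on that page is promoted too, betting that objects on the same page share a lifetime. The class name, the fixed objects-per-page, and the threshold are all illustrative:

```python
from collections import defaultdict

class PagePromoter:
    """Track per-page promotion counts during one minor collection and
    report when a page has crossed the whole-page-promotion threshold."""

    def __init__(self, objects_per_page, threshold):
        self.per_page = objects_per_page
        self.threshold = threshold
        self.promoted = defaultdict(int)   # page -> objects promoted so far

    def record_promotion(self, page):
        self.promoted[page] += 1

    def promote_whole_page(self, page):
        # True once this page's promoted fraction reaches the threshold;
        # from then on, all live objects found on it should be promoted.
        return self.promoted[page] / self.per_page >= self.threshold

pp = PagePromoter(objects_per_page=10, threshold=0.5)
for _ in range(5):
    pp.record_promotion(page=7)   # five of page 7's ten objects promoted
```

After these five promotions, page 7 has hit the 50% threshold, while an untouched page has not.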

Part 3: Scatter allocations by purpose. For example, put environments and explicit CONSes on separate pages of the minor collection space.
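A toy sketch of Part 3: give each allocation "purpose" its own current page in the nursery, so objects with similar roles (and, the bet goes, similar lifetimes) land together. The purpose tags, page size, and bump-allocation scheme are all invented for illustration:

```python
PAGE_SIZE = 4096

class SegregatedNursery:
    """Bump-allocate within a per-purpose current page; open a fresh page
    when a purpose's page fills up. Addresses are (page, offset) pairs."""

    def __init__(self):
        self.pages = {}          # purpose -> (page_index, bytes_used)
        self.next_page = 0

    def alloc(self, purpose, size):
        page, used = self.pages.get(purpose, (None, PAGE_SIZE))
        if used + size > PAGE_SIZE:          # page full, or purpose is new
            page, used = self.next_page, 0
            self.next_page += 1
        self.pages[purpose] = (page, used + size)
        return (page, used)

nursery = SegregatedNursery()
env1 = nursery.alloc("environment", 64)   # first environment page opens
cons = nursery.alloc("cons", 16)          # conses get a separate page
env2 = nursery.alloc("environment", 64)   # lands next to env1
```

The two environment allocations share a page while the cons goes elsewhere, which is exactly the grouping Part 2's whole-page promotion wants to exploit.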


Part 2: a cheap locality hack. It tends to clear contiguous regions, and it bets that closely placed objects will have similar lifetimes.

Part 3: helps make the bet in Part 2 more accurate.

Part 1: cheaply adds some noise to the selection of what gets promoted, and bets that trace distance correlates with lifetime.

If this is really so much

If this is really so much better than generational GC, why isn't this implemented in the major VMs?

Some suggestions in no particular order

1. It contains none of the buzzwords: incremental, concurrent, or parallel.
2. It was tested on SML rather than Java, the de facto testbed for GCs.
3. Looking (quickly) at the papers citing it directly or indirectly, many of its ideas (except non-age-based generations) seem to appear in GGC systems, reducing its relative benefits.
4. Memory got cheaper and paging became less frequent, so the benefits of a paging-friendly GC shrank.

I would also add: 5. The

I would also add:

5. The paper states that allocation is slightly more expensive (microbenchmarks in the paper show noticeable overhead for small problems). Reasonably assuming that allocation rates will only increase as progressively higher-level languages are implemented on VMs, any increase in allocation overhead would need to be offset by significant advantages elsewhere. Whether such advantages are present is unclear.

Yes, although the reason is

Yes, although the reason is that the sweep phase appears to be amortised over the allocation operation, presumably to help latency in a non-incremental collector. This is an idea that immix, for example, uses within an otherwise standard GGC.
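To illustrate what amortising the sweep over allocation means: instead of sweeping the whole heap at collection time, the allocator lazily scans the mark state for free slots as it allocates, paying sweep cost a little at a time (immix does something similar with lazy sweeping at block granularity). The fixed-size-slot heap below is a toy, not immix's actual layout:

```python
class LazySweepHeap:
    """Toy heap of fixed-size slots. `marks` holds mark bits from the last
    GC trace; allocation advances a finger past marked (live) slots and
    reuses the first unmarked (dead) slot it meets, so each allocation
    performs a small slice of the sweep."""

    def __init__(self, num_slots):
        self.marks = [False] * num_slots
        self.cursor = 0                    # combined sweep/allocation finger

    def alloc(self):
        # Sweep only as far as needed to find one free slot.
        while self.cursor < len(self.marks):
            slot = self.cursor
            self.cursor += 1
            if not self.marks[slot]:
                self.marks[slot] = True    # slot is now live
                return slot
        return None                        # heap exhausted: time for a GC

heap = LazySweepHeap(4)
heap.marks = [True, False, True, False]    # pretend a trace marked slots 0, 2
# Successive alloc() calls skip the marked slots, yielding 1, then 3, then
# None once the heap is exhausted.
```

Each allocation does a bounded amount of sweeping work, which spreads the pause out and keeps latency low in a non-incremental collector, at the cost of making every allocation slightly more expensive.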