Dynamic Region Inference

Dynamic Region Inference, by David Pereira and John Aycock:

We present a garbage collection scheme based on reference counting and region inference which, unlike the standard reference counting algorithm, handles cycles correctly. In our algorithm, the fundamental operations of region inference are performed dynamically. No assistance is required from the programmer or the compiler, making our algorithm particularly well-suited for use in dynamically-typed languages such as scripting languages. We provide a detailed algorithm and demonstrate how it can be implemented efficiently.

A novel garbage collector that solves reference counting's cycle problems by introducing "regions", which demarcate possibly cyclic subgraphs. These regions are updated by merge and split operations that occur on pointer update and incrementally on region allocation, respectively, ie. adding a pointer to B into aggregate C merges their regions, and trying to allocate a new region first attempts to split some random candidate region by computing the local reference counts via union-find of the region's members.

Obviously dynamic regions don't share contiguous storage like their static counterparts, so "regions" here are purely a logical concept to ensure completeness of reference counting. The implementation adds two words to each object, one for pointing to the object's current region, the other for a "next" pointer for the next object in the region.

The practicality of this approach isn't clear compared to other cycle detection algorithms, and no benchmarks are provided. I haven't found any follow-up work either.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Advantages unclear, separate union-find pass unnecessary?

The one clear advantage is that a region can be freed immediately when it's RC=0, but because region splitting is delayed, it seems to basically amount to tracing GC with incrementality benefits in only some cases. Am I missing something? If not, it's not clear that this is really a win, as the extra space overhead increases cache pressure which will swallow some of the performance advantages.

The region split algorithm also seems unnecessarily inefficient. The union-find can be inlined into the region+NEXT pointer update traversal, thus avoiding a second pass:

1. take a member of the region to split and assign it a fresh region (skip if already has a fresh region)
2. trace the objects reachable from that member and, a) add them to the fresh region if they still reference the region being split, or b) merge them with whatever fresh region they have,
3. goto 1 until all members of region to split are processed
4. update RC counts based on registers
5. delete any regions with RC=0

This is still union-find, but doing a single pass and using the region/RC machinery already in place. This should incur fewer reads and writes than running union-find using Tarjan's algorithm in the first pass, and then resetting all the region and NEXT pointers again on a second pass.

Edit: there's a lot of overhead and incremental maintenance that goes on in this algorithm that makes it unattractive. For instance, it requires two words in the header to link objects to a region and link objects into a set rooted at that region. This doesn't even include any header data needed for object layout or tags, so a real language now needs 3 words for the header, which is an absurd space overhead compared to 1 word, worst-case, for mark-sweep.

And that's not even including the region table overhead, which requires at least 3 words for each entry, each of which are allocated in 1:1 correspondence with objects at first (though entries will quickly be released on merge and reused). So really, we're talking about a space overhead of up to 6 words per object.