Regions for Code GC?

I've read many papers on region-based memory management and inference, but none of them discuss region inference for functions and machine code. A VM with dynamic code loading and jitting, like the JVM, or a VM with hot code update, requires some form of automatic reclamation of unused functions. Anyone have a link to such a discussion? Or is garbage collection still the preferable method for automatic code reclamation?

Is your concern more bent

Is your concern more bent towards correctness or performance?

I'm not sure what you mean

I'm not sure what you mean by correctness. Do you mean that some function remains unreclaimed longer than it's truly live? That might be OK as long as there's some bound.

It just seems silly to adopt region-based memory management for data if it cannot also handle code reclamation in a VM. I'm in exactly that position, so I'm wondering whether I should invest the time to adopt a region system if it can't be used to scope function lifetimes.

Code is data

What difficulty do you see in handling code? I imagine that the treatment of code is not discussed because it's not a common case and there aren't any special problems.

Code is data. You should be able to treat it just the same. The advice given in papers about GCing code should be more or less just as applicable in this case.

Hmm, good question. I

Hmm, good question. I suppose I'm tripped up by code that, at first glance, does not seem to have analogues in data structures. For instance, the code immediately following an infinite recursion/loop. Such code, while referenced, will never be reached, and is in a sense dead, but I don't quite see how it could be reclaimed. Then again, GC probably wouldn't reclaim such code either, as reachability only approximates liveness. I'll have to think about this some more. Perhaps there really is no difference.
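To make the "reachability only approximates liveness" point concrete, here is a minimal sketch in Python. The call graph and every function name in it are hypothetical, invented for illustration: server_loop is assumed never to return, so shutdown, which it references only after the loop, can never run, yet a tracing collector still has to keep it.

```python
# Hypothetical call graph: each function maps to the functions it references.
# Assumption: "server_loop" loops forever, so "shutdown" (referenced only by
# the code after the loop) is dead in practice but still referenced.
references = {
    "main": ["server_loop"],
    "server_loop": ["handle_request", "shutdown"],
    "handle_request": [],
    "shutdown": [],
    "old_handler": [],  # referenced by nothing at all
}


def reachable(roots, references):
    """Return every function reachable from the roots by following references."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(references[name])
    return seen


live = reachable(["main"], references)
reclaimable = set(references) - live
```

Only old_handler ends up reclaimable. shutdown is retained even though it can never execute, which is the same conservative approximation a data GC makes for referenced-but-never-used objects.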

Granularity of code

A lot depends on how the code is structured. If code is segmented into blocks, each essentially a sequence of contiguous instructions (possibly with pointers to other blocks, VM instructions, or other atoms), I would expect that a particular code snippet would remain live as long as any part of the enclosing block was live.

Such an approximation would probably be sufficiently good for practice.
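A rough sketch of that block-granularity rule, in Python. The mapping from snippets to blocks and all the names here are assumptions made up for the example, not anything a particular VM provides.

```python
def retained_snippets(block_of, live_snippets):
    """Keep every snippet whose enclosing block holds at least one live snippet.

    block_of maps a snippet name to the name of its enclosing block.
    """
    live_blocks = {block_of[s] for s in live_snippets}
    return {s for s, block in block_of.items() if block in live_blocks}


# Hypothetical layout: two blocks, three snippets.
block_of = {
    "parse": "block_a",
    "parse_error_path": "block_a",
    "legacy_codec": "block_b",
}

kept = retained_snippets(block_of, live_snippets={"parse"})
freed = set(block_of) - kept
```

Here parse_error_path stays resident only because it shares block_a with parse, while block_b can be freed as a whole. The cost of the approximation is bounded by the block size.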

Nested lifetimes aren't

Nested lifetimes aren't enough! Consider mutually recursive functions.

A standard GC

...should be able to deal with any cyclic dependencies among code snippets.
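As a small illustration in Python (the function names and the "hot update" step are hypothetical), a tracing pass reclaims a pair of mutually recursive functions as soon as nothing outside the cycle refers to them, even though each still has one incoming reference from the other.

```python
def trace(roots, refs):
    """Mark phase: collect everything reachable from the roots."""
    marked = set()
    work = list(roots)
    while work:
        f = work.pop()
        if f not in marked:
            marked.add(f)
            work.extend(refs.get(f, ()))
    return marked


# is_even and is_odd reference each other; main references is_even.
refs = {
    "main": ["is_even"],
    "is_even": ["is_odd"],
    "is_odd": ["is_even"],
}

before = trace(["main"], refs)

# Simulated hot code update: the new main no longer calls is_even.
refs["main"] = []
after = trace(["main"], refs)
garbage = set(refs) - after

# Each function in the cycle still has a nonzero incoming reference count.
incoming = {f: sum(f in targets for targets in refs.values()) for f in refs}
```

After the update, garbage contains both is_even and is_odd, while incoming still reports one reference to each. A scheme based purely on counts or on strictly nested scopes would have trouble here; tracing does not.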

The more interesting question, of course, is how to extend a GC with "domain knowledge" so that it can more aggressively determine that something is not live and therefore reclaim it. The conservative approach, scanning the bits for things that look like pointers, is often too conservative, even when assisted by the type system (so that it knows how to distinguish pointers from atoms). Collection of code snippets that can never be executed but are still referenced by some other code block is but one case.