
Deallocation patterns and linear types (e.g. Rust)

I've been [much] belatedly poking at Rust - particularly the move/borrow model - and thinking about the implications for allocation and deallocation performance. Though I haven't looked seriously, a couple of things occurred to me that I haven't seen people talking about. Nothing in any way profound here, but perhaps worth considering for those of you designing or implementing new languages; it appears to me that this applies to many languages using linear types. Rust's reference-counted pointers fall outside of what I'm going to describe.

I was pondering lifespans in Rust, thinking that there are probably cases where deallocations in Rust occur later (for naive lifespan reasons) than we might like - at least according to the simple description of lifespans in the specification, which is conservative. The rules for object deallocation also have the consequence that the cost of function return becomes hard to understand, because it is difficult for a programmer to have good intuitions about the deallocation overheads associated with the return. This got me pondering deallocation and heap fragmentation, to try to understand whether that's an actual problem, and then to wonder how many of Rust's allocations can be stack allocated and deallocated in the style of alloca(). So far as I can tell, the answer is: most of them. I may have this wrong, but here's the intuition:

  1. In a given function, we can usually determine statically where an allocated item (more precisely: the associated region) becomes unreachable. This, rather than reverse lexical order, is the point where the region created by the allocation actually dies. Where we can determine this point (and move semantics means we mostly can), we can sort allocations in "reverse region exit" order (last-to-die first). This may not be the same as the allocation order, which is the point of interest here.
  2. For objects that have statically known size, this means that we can craft an arrangement of space on the local stack at compile time, such that the statically sized portion of the next-to-die region is always the last (most recent) chunk of storage on the stack.
  3. For regions whose exit order we cannot statically determine (do these exist in Rust?), we place them conservatively as if they will exit last. [This can be improved, but it is correct.] These objects can still have their destructors run in the expected order according to the Rust specification; what's being deferred (so far) is the release of the space.
  4. Some care is needed when a function returns a result in the same region as one (or more) of its arguments. We can (and must) assume that region is still live, but we cannot necessarily tell whether the content of the region has grown - an effect system can mitigate this, and there are other reasons to want an "allocation may happen when you call this" effect.
  5. Where all else fails, we resort to heap allocation, but the reference can still be placed according to order of region death.
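Point (1) - that region-death order can differ from allocation order - can be sketched in a few lines. This is my own illustrative example (the names `keep` and `scratch` are hypothetical): the first allocation is moved to the caller and so dies last, while the second dies at end of scope, so a compiler sorting by reverse region exit could lay both out on the stack and pop `scratch` first:

```rust
// Allocation order differs from region-death order.
fn build() -> Box<[u8; 64]> {
    let keep = Box::new([0u8; 64]);      // allocated first, dies LAST (moved to caller)
    let scratch = Box::new([1u8; 1024]); // allocated second, dies first
    let _sum: u32 = scratch.iter().map(|&b| b as u32).sum();
    // `scratch`'s region exits here, before `keep`'s region does,
    // so "reverse region exit" order is: scratch, then keep -
    // the reverse of the allocation order.
    keep
}

fn main() {
    let kept = build();
    assert_eq!(kept.len(), 64);
    assert_eq!(kept[0], 0);
}
```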

If I have this right, region deallocation for statically sized objects is reduced to running destructors (in Rust: drop functions), handling any heap deallocations required to retire references, and then adjusting the stack pointer.

There are practical concerns - many operating systems perform naive stack overflow checks that would need to be worked around - but that's neither new nor hard.

The handling of "uncertain death" regions can be improved. Even if we don't know the exact order of region death, we can generally determine some place in the region exit ordering where we know conservatively that a region has become unreachable.
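As an illustration of an "uncertain death" region (again my own sketch, not drawn from the Rust documentation): when a runtime branch decides which of two allocations outlives the function, the compiler cannot statically order their region exits, and must conservatively place both as if they exit last, or fall back to the heap:

```rust
// Which allocation's region survives the call depends on `flag`,
// so the exit order of the two regions is not statically known.
fn pick(flag: bool) -> Box<i32> {
    let a = Box::new(1);
    let b = Box::new(2);
    // One of `a`, `b` dies here; the other is moved to the caller.
    // A static stack layout must treat both conservatively.
    if flag { a } else { b }
}

fn main() {
    assert_eq!(*pick(true), 1);
    assert_eq!(*pick(false), 2);
}
```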

It seems to me that we end up with three kinds of allocation areas:

  1. The stack, where deallocations are simply a stack pointer adjustment.
  2. The "linear object heap", where reachability is still governed in linear type fashion, but the dynamic nature of the allocations may cause objects in different regions to be interleaved. That is: the organization of the linear object heap does not obey a stack discipline.
  3. The "general object heap", which can be reference counted or garbage collected (or both - to eliminate dangling cycles.

The absence of reference counting (or analogous general object heap allocation mechanisms) in a program tells us that we don't need to bind the collector into the binary.

For anybody who has thought carefully about implementing linear types, I suspect all of this is pretty obvious. I hadn't gotten quite that far in my thinking about heap management in BitC, though I had certainly been thinking about compiler-driven stack allocation.