Who Needs Garbage Collection?

Here is the idea: Popular wisdom dictates that functional languages need garbage collection, but is it really true? For example, C++-style move semantics seem to solve the problem of returning a closure from a function: you create an object to contain a copy of all the local variables and functions, and then swap the local copy of the handle with the calling function's allocation, so when the function returns it destroys the empty handle from the caller, leaving the caller with the handle to the closure.

On this basis any acyclic datatype can be stored on the heap, with its lifetime managed from a stack handle (this is what RAII does in the C++ STL).
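
For concreteness, here is roughly that idea sketched in C++ (the names are purely illustrative; std::unique_ptr plays the role of the handle): the "closure" becomes an ordinary object holding moved copies of the locals, and the caller's local variable is the stack handle whose destructor frees the heap storage.

#include <memory>
#include <string>
#include <utility>

// The "closure" is just an object that owns what the locals owned.
struct Greeter {
    std::unique_ptr<std::string> name;              // heap data owned via a handle
    std::string operator()() const { return "hello, " + *name; }
};

Greeter make_greeter(std::string who) {
    auto name = std::make_unique<std::string>(std::move(who)); // local handle
    return Greeter{std::move(name)};   // handle moved into the returned object;
}                                      // the emptied local is destroyed here

int main() {
    Greeter g = make_greeter("world"); // g is the caller's stack handle (RAII)
    return g().empty() ? 1 : 0;        // storage freed when g goes out of scope
}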

I guess this is a degenerate case of reference counting, where we limit references to two types: a unique 'owning' reference (let's call it a handle to disambiguate), which releases the memory when it goes out of scope (it is unique and un-copyable, so there is no need to count references), and an 'ephemeral' reference (let's call it a pointer), which is restricted in that it cannot be 'leaked' to a scope shorter-lived than the scope in which the handle exists. This all sounds a lot like Ada access variables - but note the change from the scope in which the handle was created to any scope in which the handle exists, as returning handles by move semantics is possible. This allows a constructor to return a handle into a scope which is also allowed to hold pointers to the object.

It doesn't sound like it really needs anything new, just a combination of Ada's access variables and the mechanism for converting closures into objects described above.

What potential problems might there be? Could this work, or is there some overlooked flaw?

Rust Lifetimes

Region-Based Memory Management

I remember reading in old LtU threads about Cyclone or MLKit.
It seems that you needed 6 or 7 times more memory with memory regions alone (no GC), so practical systems used memory regions plus GC.

Not Region Based

This is not region based memory management, so I am not sure that the result is relevant, apart from confirming that region based management is not what you want for a low memory footprint.

Edit: meant to be a reply to "Region-Based Memory Management".

This is not region based

This is not region based memory management, so I am not sure that the result is relevant

In your post, you said:

I guess this is a degenerate case of reference counting, where we limit references to two types: a unique 'owning' reference (let's call it a handle to disambiguate), which releases the memory when it goes out of scope (it is unique and un-copyable, so there is no need to count references), and an 'ephemeral' reference (let's call it a pointer), which is restricted in that it cannot be 'leaked' to a scope shorter-lived than the scope in which the handle exists.

That latter constraint is precisely lexically scoped regions. The "owner" is the region handle, the "ephemeral" reference is the local designating the object and the region to which it belongs. Regions are problematic because many objects tend to aggregate into a single large region whose lifetime is scoped to the entire program (hence the blowup in memory use).

Of course, this aggregation only happens in lieu of copying objects because we wish to preserve sharing (in case mutables and other forms of identity are involved). If you relax the features you want to provide, as you specified elsewhere in this thread by dropping mutables, then you can implement copy semantics and avoid some aggregation, but that's a pretty severe trade off.

I'd like to see a hybrid approach where types are analyzed and sometimes copied when they are small enough, and have no identity. This would necessitate passing runtime arguments to perform a dynamic check when this couldn't be statically determined.

Seems different

Seems different from the descriptions of region based memory management I have read. Maybe I am not describing it very well. The owner handle is not restricted to a 'region', in fact there is nothing like a region unless you simulate it with a vector of handles. Owner handles can be in local variables (on the stack), or in data structures (like a singly linked list of all owner handles). You can return the owner handle to the caller, but passing it into functions may not do what you expect: the outer and inner scope handles would get swapped, so the object would be freed when the function it was passed to returns, leaving the outer scope with an empty handle. You do this if you are putting the owner handle into a data structure, for example inserting it into a tree.

The key is to understand that the only operation on an owner handle (apart from construction and destruction) is swap. A typical sequence might go like this:

f = {
  h1 = // construct new tree node.
  h2 = // construct new tree node.
  h1.branch = h2 // this swaps them
  return h1
}

h3 = f()

Here there are two handles on the stack in 'f' and both must be freed on exit, but h1 is swapped with h3, and h2 is swapped with an empty handle (called branch) inside h1. So two empty handles get freed when f returns, and h3 owns a node (originally h1) which has a handle 'branch' which owns what was originally h2.
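
The same sequence can be rendered as a rough C++ sketch, with std::unique_ptr standing in for the owning handle and std::swap as the only transfer operation (the names are illustrative, not a definitive design):

#include <memory>
#include <utility>

struct Node {
    std::unique_ptr<Node> branch;         // starts out as an empty handle
};

std::unique_ptr<Node> f() {
    auto h1 = std::make_unique<Node>();   // construct new tree node
    auto h2 = std::make_unique<Node>();   // construct new tree node
    std::swap(h1->branch, h2);            // "h1.branch = h2": swap h2 into the empty handle
    return h1;                            // moved out to the caller; the emptied
}                                         // h1 and h2 are destroyed on exit

int main() {
    auto h3 = f();                        // h3 now owns what was h1, whose 'branch'
    return h3->branch ? 0 : 1;            // owns what was h2
}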

The owner handle is not

The owner handle is not restricted to a 'region', in fact there is nothing like a region unless you simulate it with a vector of handles.

I'm saying the "scope" is the region in your scheme, since once the owner's scope exits, the memory is all reclaimed. Elaborating your example, where [] denotes implicit region parameters:

type Node 'a = Root | Node [%r1] { mutable branch : Node @ %r1; ... }
f [%r1,%r2 where %r2 <: %r1] = {
  h1 = Node[%r1] { branch = Root } // construct new tree node in %r1.
  h2 = Node[%r2] { branch = Root } // construct new tree node in %r2.
  h1.branch = h2                   // this has no region effect, since inference
  return h1                        // determined that %r2 subtypes %r1
}

h3 = f[%r,%r]()   // assuming %r region is in scope

The final line means h1 and h2 have the same lifetime, but different lifetimes are also possible, as long as the %r2 passed to f outlives %r1. So your scheme is more like ownership types, which can be encoded in regions; the problem is precise region inference.

A reference to something that sounds more like what you're asking about: Ownership You Can Count On: A Hybrid Approach to Safe Explicit Memory Management:

We present a new approach to memory management called alias counting which combines type-based object ownership and run-time reference counting of references by non-owning (aliasing) pointers. When an owning pointer is destroyed, if the target object's reference count is zero, the object is destroyed; otherwise a run-time error occurs.

Of course, a "run-time error" is only needed because they don't employ regions to statically type the scoping, which seems to be a goal you wanted.

no counting

Thanks for finding that. They use reference counting, which is kind of the reverse of what I want to do. The non-owning pointers are not reference counted; they are just plain pointers, but we statically check that they don't leave the scope where the owning handle is valid. This is like Ada's access types, but the scoping rule is slightly different.

I guess you can call it region based, if you want to call local variables in C/C++ region based as well. I have not heard it called that; I would normally refer to it as stack based memory management. I think there are more restrictions on stack frames than on regions, in that they strictly follow the program call structure, a new frame being created on function entry and destroyed on function exit.

I guess you can call it

The non-owning pointers are not reference counted; they are just plain pointers, but we statically check that they don't leave the scope where the owning handle is valid. This is like Ada's access types, but the scoping rule is slightly different.

But how do you check this statically? I originally said this sounded like a region constraint, because regions are the most expressive system to verify this property that I know of. It sounds like you have a simpler model in mind that wouldn't require any region annotations, so I'm interested in what it is.

I guess you can call it region based, if you want to call local variables in C/C++ region based as well. I have not heard it called that; I would normally refer to it as stack based memory management.

The C stack is a degenerate region-based allocator. Every scope in C implicitly introduces a new region for all variables declared in that scope; you're just limited to allocating in the most nested region as opposed to any higher region.

Ada access objects

Ada access objects do similar checks to what I am suggesting. Effectively the unique handles will involve a phantom type, such that we can identify the reachability statically. For example, with a linked list, if an owning pointer to the first node is in scope, then a pointer to any node reachable from there is in scope too. In effect a function must be passed either an owning handle or an ephemeral reference that is in scope in the caller, so it is safe to return any ephemeral reference that is reachable. The only tricky case is when the function returns a pointer to a new object, which is illegal unless the function also returns the owning handle, or the new owning handle is inserted into a structure reachable from one of the owning handles passed in.

i'm slow

are you talking about pure fp? any functional language where you can have long-lived things seems to be a bigger problem than what you are talking about. furthermore any fp with parallelism/concurrency where i can pass the same datum off to 2+ things at the same time would be a problem? is it call by name or value? is it erlangish where only copies are allowed, not shared references?

there's linear types, there's uniqueness types, there's 'failed' typestate, there's variants of gc that are ref counting but with a way to break cycles, etc. a la Bacon et al.

Maybe.

I guess it's hard to think about the exact mechanisms without specifying some of the language semantics. Do you have any specifics on the problems with longer-lived things?

To answer the questions: parallelism can be done via message passing (transfer of ownership), and I would probably go for call-by-value. Shared references (pointers) are allowed, but they have no ownership; rather, they are not allowed to leave the scope which holds the handle.

It's not really any of those kinds of things. Linear types are for when the same 'address' may contain different types at different times; uniqueness types are when the value can only be used once; I think 'failed' typestate is about uninitialised variables.

The question is about whether a usable functional language could exist without GC (that means no ref-counting or mark/sweep) just using 'stack' based memory control.

prescheme

well i want to learn to write in prescheme (somewhere in s48.org) and it doesn't have a gc, but it is really only meant to be more like an intermediate compiler target form iiuc. :-)

prescheme looks interesting

Apart from a general feeling of unease about languages with a spec as large as Scheme's, the way prescheme does things seems kind of like what I am looking for.

Which spec?

PreScheme dates from the R5RS era, whose spec is only 50 pages, including the whole standard library. Or do you mean you are uneasy about specs that short?

Prescheme isn't functional...

...although all prescheme programs are scheme programs.

The reference implementation is part of Scheme 48:

Mercurial repository for s48-stable: ps-compiler/prescheme/

It isn't a functional language, since all closures need to be allocated statically: you can't define such staple combinators as composition or fold, for instance; IIRC there is not full tail call elimination either. The attraction is that you have the full power of scheme with its macro language at the first stage of the compilation, and because prescheme is a subset of scheme, it is possible to turn regular scheme code into prescheme via in-language code transformation.

How deeply do you copy?

When you're building a closure, how deeply do you copy the locals? It looks like that's going to require attention from the programmer in general, so that there will be a bit of manual memory management involved. And there will always be cases where you have to introduce new copying to make something work where you would have just let the GC figure it out if you had one.

Move Them.

No need to copy: as the scope is disappearing, they can be moved. So we create an object with empty handles for all the captured variables, then swap the local handles for the object handles. When the scope ends, the empty handles are freed, and the returned object contains the data that was in the local variables.
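
In C++14 terms, init-capture gives much the same effect as the swap described (a sketch under that assumption; the function name is made up): each local handle is moved into the closure object rather than copied.

#include <memory>
#include <vector>

auto make_length_fn(std::vector<int> seed) {
    auto data = std::make_unique<std::vector<int>>(std::move(seed)); // local handle
    return [data = std::move(data)] {        // handle moved into the closure object;
        return data->size();                 // the emptied local dies at scope exit
    };
}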

mutual recursion

Let C0 be a closure which is closed over a graph of mutually recursive closures. The set of these closures is CG = {C0, C1, C2, ...}

Let Cj (!= C0) in CG be closed over the same graph and set of closures CG.

Let Ck (!= C0, != Cj) in CG be closed over a subset graph and a subset of closures CGk.

Assume that C0 is not a member of CGk.

For example, CGk might be the set { C1, C3, C5, ..., Ck, ... }

Now the lifetime of all members of CG must be no less than the lifetime of C0.

The converse is not true. In particular, it is the case that the lifetime of C0 may be shorter than the lifetime of the members of CGk.

Similarly, the lifetime of C3 must be no shorter than the lifetime of C2 but the converse is not true. C2's lifetime may be shorter than C3's.

The closure C0 is first formed. It is returned to some caller in the mechanism you describe.

The caller will invoke C0.

Assume that C0 will return some Cx and that Cx is either Cj or Ck.

We don't know until run-time whether C0 will return Cj or Ck.

The caller will preserve the reference to Cx but will drop the reference to C0.

I believe that there is no general way other than graph tracing to decide, at that point, whether C0 is alive or dead. In our example sets, the liveness of all members of CG - CGk = {C0, C2, C4, ...} is unknown without graph tracing.

Aside:

This is actually one of the reasons I've come to believe that the functional programming style is a bad idea and is a poor foundation for programming in general.

Handles are unique.

I am not sure it's a problem. The rules of ownership only allow one 'handle' to exist. It is non-copyable. The only operation available is swap with an empty handle, for the move semantics.

So there cannot be any cycles of 'ownership'. It must be a DAG, and the lifetime of any contained object must always be shorter than the container.

'pointers' have no ownership, so all we have to do is prevent them being accessed outside the scope which owns the handle to which the pointer refers. This can be done statically, as with Ada's 'access' variables.

closures and cycles

there cannot be any cycles of 'ownership'. It must be a DAG,

You could write an interpreter for something like pure lambda terms represented as finite trees, perhaps with some sharing. Then there is no mystery that you don't need graph-tracing GC but you do have some other practical problems to grapple with.

I understood the question in your post to contemplate a more conventional execution model.

If you are saying that a DAG restriction on closures prevents the construction of mutually recursive closures, those aren't closures. They're some more restricted closure-like thing.

there cannot be any cycles of 'ownership'. It must be a DAG,

In your scheme for pointer semantics, yes, that's the idea.

I've shown that this idea cannot represent the very natural concept of a graph of mutually-recursive closures (again, assuming a fairly conventional execution model).

Mutually recursive closures

How would you create mutually recursive closures? If I have a local scope and return a function from it, it contains a closure; if that calls functions that are in closures, you would get a DAG of closures.

re: How would you create mutually recursive closures?

How would you create mutually recursive closures?

More conventional functional languages will at least offer something like a LETREC-style construct.

Alternatively, I don't have to tell you that in a low-level lambda style a program could instead directly use a fixed-point combinator.

If you propose a functional language in which a fixed-point combinator cannot be defined and which lacks LETREC, "closure" is the wrong name for these acyclic objects you are talking about. The objects you are talking about are much more restricted than closures. These non-closure objects may be interesting and may be useful, but they are not "closures" as ordinarily meant.

If you propose a functional language that permits a fixed-point combinator and has an execution model based around something like finite trees representing lambda terms then, again, you very obviously don't need GC (we already knew that) but you have other hard problems to wrestle with (all kinds of performance issues, mainly).

Mutual recursion not a problem

I don't actually think mutual recursion is a problem:

let owner = 
   letrec f = ... (refers to g),
          g = ... (refers to f).

The closure objects of f and g would belong to the scope of 'owner' and would both be released together when owner exits, so they can hold ephemeral pointers to each other without problems. Note there can be no cycles in "ownership", but ownership is something different from a pointer to something.

I do see the issue that if owner returns a function that refers to 'f', 'g' would also need to go into the closure, so there would need to be some dependency graph chasing, but that would be at compile time, and cycles would not be problematic; you just need to track the nodes already visited.
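
A small sketch of that claim in C++ (illustrative names only, with std::function standing in for the closure bodies): both closure objects are owned by the enclosing scope and refer to each other only through non-owning pointers, so they are released together when the scope exits.

#include <functional>
#include <memory>

int run() {
    struct Fn { std::function<int(int)> call; };
    auto f = std::make_unique<Fn>();     // owning handles, both local to run()
    auto g = std::make_unique<Fn>();
    Fn* fp = f.get();                    // ephemeral pointers; they never leave run()
    Fn* gp = g.get();
    f->call = [gp](int n) { return n == 0 ? 1 : gp->call(n - 1); }; // is-even
    g->call = [fp](int n) { return n == 0 ? 0 : fp->call(n - 1); }; // is-odd
    return f->call(4);                   // both closures freed together at scope exit
}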

owner vs. closure

Terrific. I have to say this is sort of funny to me because I went down a train of thought just like this in recent months.

Let me point out some issues here.

First, "owner" is not a closure, it's an object that happens to have some mutually recursive methods. Now, that raises a question: why does that distinction matter for memory management?

Let's suppose that "owner" is a little more complicated:

let owner =
   letrec f = ... (refers to g).
          g = ... (refers to f and h).
          h = ... (refers to i).
          i = ... (refers to h)

If f .. i were ordinary closures, the program could discard all references to f and g, keeping a reference to h. Graph tracing could then discover that f and g are dead while h and i remain alive.

That is not possible here. In effect, the program can leak the storage consumed by f and g, at least until h finally dies.

So that's one way in which you haven't really solved the problem that closures force an implementation to provide graph tracing collection (or else leak storage).

A second observation is that at least in your notation, the pattern of mutual recursions is statically known. An f can call a g which can call an f or an h. An h can call an i which can call an h. The structure of the graph of references is statically known.

A graph of closures is more general than that. Its structure can be unknowable until run-time. For example, instead of knowing statically that an h can refer to an i, what exactly an h will refer to may depend on parameters supplied at run-time.

Function Closures

The idea would be that you have to be able to statically determine the owner, but the ephemeral pointer can be dynamic. You can represent any closure graph if you make the owner the main function, so the problem is not whether it is possible, but how efficient you can make it by pushing the ownership as far down the call-graph as possible.

But I think you have captured the key part of the problem that I was missing when trying to understand the general requirement for GC. The other option is just not to have closures but closure-like objects; it can still be functional.

What are the requirements on closures for simple lambda calculus (with no let bindings, just abstraction and application)?

re: requirements on closures for simple lambda calculus

What are the requirements on closures for simple lambda calculus (with no let bindings, just abstraction and application)?

I think there are LtUers who are better qualified than me to give up-to-date recommendations but maybe I can give a pretty good one.

I will point you to the topic of implementing functional programming languages using "graph reduction".

Maybe you can obtain or borrow a book called "The Implementation of Functional Programming Languages" by Simon L. Peyton Jones.

Graph reduction is a non-standard execution model and one that presents plenty of not-really-solved problems. So I'll also point you at:

If you haven't read this classic, and for a very approachable, down-to-earth document, I recommend the MIT AI Lab Technical Report #474 (aka "AI-TR-474") called "Rabbit: A compiler for SCHEME (A study in compiler optimization)" by Guy L. Steele.

I'm pretty sure that one is even available on-line.

Those two works are pretty historic and foundational. Tons more work has been done since both of them. In some ways, both are "dated". Yet if you are familiar with those two works, a lot of more modern stuff is more approachable. Both may also lead to various Depe Thawts about the mysterious foundations of computing.

If you are already way more advanced than those recommendations I apologize but maybe they will benefit someone else.

Problems not Solutions

I favour Richard Feynman's approach: start from first principles and formulate one's own solutions, and don't look at other people's solutions until one has independently solved the problem (or failed to solve it). So at the moment I am looking for precise/concise descriptions of the problems. References to other solutions will be useful for me later, of course.

In this particular case I was looking for a mathematical treatment of the required properties of a closure. It would seem that without recursive let binding (or any let binding) closures must form a directed acyclic graph in the simply typed lambda calculus? The Y-combinator would appear to form an infinite but acyclic chain of closures.

tears in the rain

In this particular case I was looking for a mathematical treatment of the required properties of a closure.

"Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory" by Joseph E. Stoy, I suppose.

So at the moment I am looking for precise/concise descriptions of the problems.

There are none.

Rabbit (and the LtU papers): Something like "The upward funargs problem and lack of lexical scope seems messy in LISP and it seems a barrier to this lambda correspondence we see. Our functional style likes lots of function calls but those are traditionally expensive. Hey, tail recursion shouldn't blow out the stack but it does! By the way have you seen Stoy's new work about Dana Scott's and Christopher Strachey's work on denotational semantics? Also, optimizing compilers seem like a wide open field these days."

Peyton Jones: Something like: "The semantic beauty of languages emphasizing functional style is marred only by the use of mutation, yet attempts to provide pure languages have so far proved inefficient. Can we either fix that or figure out why it must be the case?"

Both of these guys did the work we're talking about in an era when a lot (not all) of the academic emphasis was an attempt to "elevate" computer science into some kind of elegant branch of mathematics from which, almost as an afterthought, easy and powerful engineering techniques would flow nearly effortlessly.

The two works of theirs I pointed at were early indications that the "let's do math!" strategy might pay off hugely. It's arguable that, since then, it hasn't. Thus this earlier work stands as a foundation for a lot of current practice, but not really as a convincing foundation for CS or software engineering generally.

If you propose a functional

If you propose a functional language that permits a fixed-point combinator and has an execution model based around something like finite trees representing lambda terms then, again, you very obviously don't need GC (we already knew that) but you have other hard problems to wrestle with (all kinds of performance issues, mainly).

I'm developing a language that would qualify, and I can vouch for the performance issues. Interestingly, we can regain reasonable performance by use of a little partial evaluation and interning, recognizing when a function constructs itself... but then we get the GC issues again. :)

Heavy machinery needed

To enforce that a pointer does not get dereferenced after the handle has gone out of scope, you need heavy machinery. That introduces a power/complexity trade-off. The simple syntactic rule that a pointer can't leak out of the scope of the handle is not enough to express common programs. Consider a doubly linked list.

Heavy Type System

The point is that a pointer cannot get dereferenced after the handle has gone out of scope, because it will be illegal (a compile error) to try and return it out of that scope. This would be statically enforced by the type system.

I was actually thinking of something simpler and Lisp-like, or Haskell, where pure references don't exist. A doubly linked list should not be any more problematic - if you can access the list, then the pointers within it must be valid, because the list is in scope. You only need to guard against a function returning a pointer to a list node outside of the scope of the list handle.
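
To make that rule concrete, a C++-flavoured sketch (the check itself would come from the proposed type system; today's C++ compiler will not actually reject the second function):

#include <memory>

struct Node { int value = 0; };

int* ok(std::unique_ptr<Node>& owner) {    // the handle lives in the caller's scope,
    return &owner->value;                  // so the ephemeral pointer may be returned
}

int* not_ok() {
    auto owner = std::make_unique<Node>(); // the handle dies at the end of this scope,
    return &owner->value;                  // so returning the pointer must be a compile error
}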

intra-heap pointers

While your proposal is solid in terms of objects owned by the stack, it does not explain the preferred strategy for pointers between objects on the heap. In the case of a DAG, you generally cannot statically determine which of two owning pointers will be lost first.

C++ can cheat here, since it doesn't have to prevent dangling references. In Rust a DAG will cause you to be using the GC.

Owning pointers are unique

So the idea is that owning pointers are unique and cannot be copied, so the problem you mention cannot occur.

Dangling references are prevented by not allowing ephemeral pointers to be returned to a function where an owning pointer is not in scope.

Pointers in data structures are interesting. Owning pointers are fine, but of course there can only be one. If the owning pointer can be moved, it will be; otherwise the object (not the pointer) would be copied. So cyclic data structures cannot be created by owning pointers.

I think we do not care about ephemeral pointers in data structures, as we can check any attempt to read the ephemeral pointer by making sure the owning pointer is reachable from the scope trying to read the ephemeral pointer.

Can you describe in more

Can you describe in more detail how you would handle a doubly linked list? I imagine that the forward pointers will be handles, and the backward pointers will be ephemeral references. At node n-1 we have a handle pointing to node n, and at node n+1 we have an ephemeral reference pointing to node n. How would the compiler verify that the handles outlive the ephemeral references? Seems to me that you need some kind of alias analysis / shape analysis for that.
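
For concreteness, the shape being imagined, sketched in C++ (std::unique_ptr standing in for the owning handle, a raw pointer for the ephemeral back reference; whether a compiler could verify the lifetimes is exactly the open question here):

#include <memory>

struct DNode {
    std::unique_ptr<DNode> next;   // forward pointer: the owning handle
    DNode* prev = nullptr;         // backward pointer: ephemeral, never owning
};

std::unique_ptr<DNode> make_two_node_list() {
    auto head = std::make_unique<DNode>();
    head->next = std::make_unique<DNode>();
    head->next->prev = head.get();  // valid only while the owner 'head' is live
    return head;
}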

You got me.

I don't really want to speculate too much without doing the hard work :-) but I don't think it has been proven that you can't do it. The doubly linked list is a great example because it creates problems when deleting a node, or splitting the list. I think cyclic linked lists also offer an interesting challenge.

Since you mentioned Ada's

Since you mentioned Ada's access types, here's a concise example as well. To prevent the erroneous program there, you'd have to track "origin level" (O) and "contents level" (C) as separate parameters, and at every scope exit enforce C >= O when C is the current scope being exited.

For maximum expressiveness, you also need this information in function signatures that accept access parameters, ie. the relation of arg0's O and C relative to argN's O and C, because arg0 could point to arg1's C or vice-versa at function return. If you don't include this information in signatures, you could conservatively assume that arg0's and arg1's C level is the minimum of the two inputs, and this would simply rule out some perfectly safe programs.

The more flexible you make it, the closer it gets to regions. I think in the most general case, the dynamics of something like list splitting are well captured by dynamic region inference, ie. splitting needs a union-find to determine the number of unique/owner handles needed.

Ada Problems

Access types in Ada work well, but I have come across one limitation I would like to avoid. That is, when passing a collection into a function by reference (i.e. in "in out" mode), it should be possible to return an access to a member. Passing by reference is a kind of implicit access, but the access restrictions do not treat it as such.

I also don't like that Ada dropped support for returning by reference, as that makes any container update need to return access types and hence requires the container itself to be passed in by access type to be fast and simple - although I understand why it was done, as 'references' don't really exist in Ada.

Over this before

Haven't we been over this before? It doesn't fly, unless you severely restrict the language and pay potentially significant space overhead in the representation of certain types.

Value semantics not restrictive?

I guess it comes down to what your opinion of a severe restriction is.

Let's take untyped lambda calculus as a representative functional language, let's pass and return objects by copying, and have no references. I would not call this severely restricted, but it seems trivial to do without garbage collection as everything is on the stack and there's lots of copying.

And...

...you need whole program compilation, because you need to know the maximum size of all possible closure environments (and waste it on every closure). That is a pretty severe restriction. Similarly for other features.

Not being able to use mutable state is severe as well, of course, even if you want to be pure (I think you can't even have a state monad, let alone an efficient one).

Do you?

I don't see why you need to know the max possible size of all closures. You allocate objects on the heap with unique pointers on the stack. The pointer frees the pointed-to object when it goes out of scope. The value semantics with copying is the behavioural semantics, not the implementation, which is free to internally use pointers to const objects and elide copies, replacing them with moves.

Even without the owner/ephemeral pointers above, you could allow references to objects to be passed into functions without any difficulty (still no pointers). You could use Ada-like "in", "out" and "in out" argument decorators to avoid breaking the call-by-value model (the user does not need to understand references). "in" would be read-only, so no copying is necessary; in fact "out" and "in out" pass by reference, so there is no copying there either. Add to this the C++ copy elision for returning local objects by move semantics.
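
A rough sketch of how such decorators could be lowered without copying, using C++ references for illustration (the "in"/"out"/"in out" keywords themselves are the hypothetical language's syntax, not C++):

#include <vector>

// 'in'     -> read-only reference: no copy is made
// 'out'    -> reference the callee overwrites
// 'in out' -> reference read and updated in place
void scale(const std::vector<double>& in, double factor, std::vector<double>& out) {
    out.assign(in.begin(), in.end());
    for (double& x : out) x *= factor;   // the caller still sees value semantics
}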

The previous discussion was in the context of statically determining polymorphism and trying to unbox values. Here I am interested in whether any functional language without GC is possible, and what it would be like.

Concatenative?

Here I am interested in whether any functional language without GC is possible, and what it would be like.

Sure, look at pure concatenative programming. When a value is dropped from the stack, it can be reclaimed immediately. As optimisations, unique values can be mutated in place, and sharing can be used to avoid copying.

sharing

Once you add sharing, you'll need GC again. :)

FP without sharing tends to hinder a lot of useful idioms that depend on sharing; for example, updating persistent data structures becomes O(N) instead of the more typical O(lg N) or amortized O(1). Though, you might be able to limit this cost to when you actually need the prior version.

A pure functional, non-lazy language might avoid cyclic data structures. We don't even need recursive functions. Fixpoint combinators are easy to model and don't need to be primitive (e.g. `[dup quote swap compose] swap compose dup quote swap compose`).

Thus, we can have a fully acyclic functional language with explicit copy and drop, suitable for working without GC or with just a reference counting GC. Even in this context, I'd recommend use of value sharing (i.e. on `dup`) and a generational copying collector with a per-core or per-thread bump pointer nursery.

There would be no need for Keean's ephemeral pointers in this case. (The right answer to "what about doubly linked lists" is "use a finger tree instead". Though, we could use Oleg's approach if we really insist on linked lists.)

Already on the same page.

Once you add sharing, you'll need GC again. :)

True, but GC as an optimisation is qualitatively different from GC as a requirement for correct program behaviour. Deterministic destruction with effectful destructors is nice for program structuring, but it also imposes limitations on the deallocation strategy.

A pure functional, non-lazy language might avoid cyclic data structures.

That’s what I’m betting on. The vast majority of the time, I don’t need pointer cycles. I occasionally need cycles of other kinds—graphs, repeating sequences, ring buffers—but those don’t need to interfere with the allocation strategy. Think hardlinks versus symlinks.

Fixpoint combinators…don't need to be primitive

If you have a type system that can infer dup quote swap compose, let me know. ;)

…a generational copying collector…

As long as we’re on the topic, I would really like to see per-value GC strategies, much like we already have per-value ownership and lifetime semantics. It can be, and often is done in large C++ projects—use new here, a pooled allocator there, alloca() yonder. Might as well make the runtime aware of such things in a managed language rather than pretending a GC of one size can fit all.

A bit off topic, but I've

A bit off topic, but I've always wondered how pure functional programming encodes cyclic structures and cyclic iterative computation (e.g., data flow analysis on a CFG with back branches). Is it really as simple as a Y combinator?

Z combinator

Is it really as simple as a Y combinator?

For a lazy language, yes. For a strict language, we can't really encode cyclic inductive data structures (e.g. trees or lists), but we could encode cyclic coinductive structures (e.g. streams, objects, automata) using the Z combinator.

Usually you have to use

Usually you have to use indirection. You keep a list of basic blocks with unique ids, and refer to them via the id instead of directly, i.e. you roll your own pointers with reference equality.
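
Sketching that idiom (here in C++ for concreteness; in a pure language the block table would be an immutable map rather than a mutable vector):

#include <vector>

struct Block {
    std::vector<int> succs;   // ids of successor blocks; cycles are fine at this level
};

struct Cfg {
    std::vector<Block> blocks;               // the single owner of every block
    const Block& at(int id) const { return blocks[id]; }
};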

I would expect that,

I would expect that; otherwise you couldn't have cycles at all. But what about the work list? I'm guessing the work list is input and output, and you just keep processing items on the list until it's empty?

With a lazy language you

With a lazy language you could have cycles but then you still can't run the dataflow analysis because you couldn't detect whether the node you're visiting is a new one, or one that you already visited in a cycle. You need reference equality for that, which isn't pure.

Yes, you have to get the worklist as input and instead of modifying it in place you create a new one and continue with that on the next iteration.


Structured Graphs

Another idea is in Functional Programming with Structured Graphs by Oliveira.

In summary, you extend data types with explicit constructors for binders and variables. So for example, an immutable list has (cons x lst) and nil. To switch to a potentially cyclic stream, you add (mu v lst) for defining cycle-targets and (var v) for back-reference links. The paper uses "Higher Order Abstract Syntax" for the binders (ie. reuses the language's lambdas) and also goes on to show how to model more complex structures than lists. The most interesting bit is how the explicitness enables writing code that can reason about sharing much more easily.

In this thought framework, you can also view memory as one big letrec binder, where the left-hand sides are memory addresses. Modeling such binders as data simply lets you tighten the bounds on the scope of some particular use of mutual recursion.

Identity is still needed for

Identity is still needed for work list algorithms that might have to visit the same node multiple times. And frankly, I can't think of any other use for cyclic data structures if some kind of "recognized" traversal isn't involved.

True, but orthogonal

You're right that the HOAS encoding does not itself afford identity. However, it does not preclude an identity mechanism.

When I experimented with this idea in a lisp context, I did not use HOAS. Instead, I encoded mu environments explicitly and used gensym to generate fresh identities for binders. That's as good as identity, but only at binders, i.e. the cyclic landing pads.

Being in a lisp, I also had object identity, if I needed it. Of course, Haskellers aren't as lucky. They'd have to wrap the whole thing in a unique-source monad or do something equally unpleasant with unsafe operations.

Cyclic structures in pure call-by-value

There are extensions to call-by-value that allow you to handle cyclic structures such as let ones = 1:ones in e. Common idioms such as zip [1..] xs still fall outside the scope of those extensions.

Even if I had those extensions available in my programming language I would take a writer monad as my first approach to solving your worklist example. Is there an obvious elegant solution for worklists in languages with immutable cyclic structures?

organic types

If you have a type system that can infer dup quote swap compose, let me know. ;)

While it may seem hackish and unclean, a valid and effective option is adding to your typing algorithms several specific rules and recognizers to handle fixpoint behaviors and other known safe or contextually safe (but difficult to type) idioms as they come up. The patterns in question are not primitive, but may be treated something like primitives if convenient (by compilers, too).

I understand the desire to create a small set of clean rules to infer everything. It's certainly a lot easier to standardize, which in turn is important if you want types to guide program behavior (e.g. typeclasses). But I also like the idea of pluggable type systems and linters and termination checkers and so on that can evolve independently of the language and do not limit program expressiveness. I'd like to run multiple type systems in parallel, providing edit-time feedback from all of them.

I would really like to see per-value GC strategies, much like we already have per-value ownership and lifetime semantics [..] make the runtime aware of such things in a managed language rather than pretending a GC of one size can fit all.

I'd rather just be rid of the heap vs. stack concept. I'm willing to model partitioned data in a semantic sense (e.g. data at the CPU vs. data at the GPU; data at the client vs. data at the server). We should design our languages and paradigms such that GC (and the strategy for it) can be local to each logical partition.

partitioned data

I'm willing to model partitioned data in a semantic sense (e.g. data at the CPU vs. data at the GPU; data at the client vs. data at the server).

I'm wondering, what are your thoughts on partitioned global address space (PGAS) or Active Global Address Space (AGAS) in this context (or would this be only tangentially related)?

logical and physical partitions

I'm interested in a logical partitioning of resources (in the PL layer) that can be aligned with physical partitioning (in the implementation layer). Modal types seem useful for modeling locations, i.e. where information can be constructed. (There is also a relationship to heterogeneous computing, where we model different locations as having different axioms/primitive functions.)

PGAS and AGAS seem only tangentially related.

Sounds intriguing! Are there

Sounds intriguing! (Including the relationship to heterogeneous computing.) Are there some good references you could recommend?

references

There's some good work by Jonathon Moody, e.g. slides "Type theory for mobility and locality", and a couple papers by Giuseppe Primiero. There's also work on "Modal types for mobile code" by Tom Murphy VII. And earlier works by Frank Pfenning and Rowan Davies.

Thanks!

Thanks!

Linear typing with no recursion

"Unique pointers" means everything has to be linearly typed -- that's very restrictive. You can try to write a functional program in Rust, while ignoring its Rc and Gc pointers entirely, if you want to know how the former could work, but also how limiting it is, especially for higher-order programming.

(Or did you actually mean reference counting? That of course is a form of garbage collection, just a particularly poor one.)

Copying is essentially equivalent to, or the essence of, unboxing. Problems mentioned in the previous threads included how that precludes recursion -- in the case of closures, and hence recursive functions in general. Wouldn't that make a rather poor functional language?

re: recursion

Problems mentioned in the previous threads included how that precludes recursion -- in the case of closures, and hence recursive functions in general. Wouldn't that make a rather poor functional language?

We can still model recursive behaviors via fixpoint. Syntactically, it's a bit less convenient. But in practice, you'll simply build a library of fold and traversal combinators, stream processing models, and such. And most programmers will favor these combinators rather than raw fixpoints. IMO, this makes for a fine functional programming language.

Unique pointers don't imply linear types, just that any copies are explicit. Higher order programming (including fixpoints) can be expressed by copying a closure once for each use.

Poor man's linearity

Unique pointers are a poor man's substitute for linear types. They don't necessarily require copying to be explicit.

I don't understand what you mean to say by "modelling recursive behaviors via fixpoint" -- that's how recursion is always modelled, isn't it?

recursion with names

Recursion in most functional programming languages is modeled using indirection through a naming system - e.g. you define the word 'fibonacci' in some external system, then you use the word 'fibonacci' within the definition. I suppose you could argue there's a fixpoint involved, but if so it is very implicit and indirect. By 'modeling via fixpoint' I mean explicit and direct use of, for example, a fixpoint combinator. E.g. `fibonacci = fixpoint $ \ f n -> if (n > 1) then f (n - 1) + f (n - 2) else n`.

Use of the name system to model recursion does have consequences. For example, you cannot (without diverging) simply inline a definition wherever you see a defined word. You also can't define anonymous recursive behaviors without an extra 'letrec' or similar to introduce a local namespace. You must also deal with namespaces deeply in semantics, as opposed to treating names as a separate concern in a macro or link layer.

Unique pointers are a poor man's substitute for linear types.

Unique pointers prevent aliasing. Linear types prevent copying and dropping. I suppose in functional languages, where aliasing is effectively the copying of values, these concepts might seem to overlap. But, in a general sense, neither is a substitute for the other.

Very Useful

I like the separation of concerns here. In the end, even with recursive names the behaviour is still stack-bound, so a unique closure pointer on the stack would work fine. Polymorphic recursion would be okay provided all the possible return types are members of the same class, so that the function offsets in the closure would be the same.

Stylistic difference only?

Your argument about recursion seems to be a purely syntactic one. It is not clear to me that it has any particularly deep implications whether the fixpoint construct is a binder by itself, or if the binding happens inside the functional you pass to it. It is mostly a stylistic difference.

As for linearity, aliasing = copying a reference, and hence, unique pointer = linear (or rather, affine) reference. This is standard in substructural type systems with state. I'm puzzled what difference you see there, or how the correspondence is specific to functional languages.

Syntax isn't just stylistic

Syntax has a deeply structural aspect. For example, the distinction between structured programming and ad-hoc gotos is largely syntactic.

A namespace is an implicit structure in most languages. It might be difficult to appreciate the structural nature of namespaces until you've worked without one, or you've worked with many kinds (e.g. dynamic scoping vs. lexical scoping vs. flat; mutable vs. immutable namespaces; etc.). Recursion via namespace vs. fixpoint combinators has implications for dataflow, security, and our ability to robustly disentangle code from its environment (for serialization, distribution, etc.).

Regarding linearity: you earlier said "Unique pointers" means everything has to be linearly typed. And this is wrong. Relevantly, the value referenced by the unique pointer is not substructurally typed. Further, you went on to discuss a specific context - functional programming (in Rust) - in which aliasing vs. deep-copying values is only a performance distinction. In this context, unique pointers would not achieve even a significant affine property except for objects that lack an equivalence preserving copy method (which should be rare for values in FP). My earlier responses to you have been made in this context.

Borrowing, or keeping

I'd been down this road before, years ago, in the context of trying to make C++ pointer-safe via reference counting without killing performance.
It's possible, but may be too unwieldy. I called Rust's "borrow" concept "keeping". "keep" becomes a qualifier on parameters, like "const".

More specifically, function parameters which are references or pointers would have four access permissions - read, write, keep, and delete. "Read" and "Write" are implied; "const" turns off write permission. That's standard C/C++. "Keep" permission means that a function can keep a copy of a parameter after the function returns. "delete" permission means the function can delete the object pointed to.

Lack of "keep" permission has several implications. Any copy of a pointer or reference must be to a scope that will not outlive the source scope. So you can copy a non-keep pointer/reference for use in an inner block, or pass it to another non-keep parameter. Rust does much the same thing.

In a reference counted system, it is not necessary to update reference counts for non-keep pointers. They must have had a non-zero reference count at scope entrance, and they will have the same reference count at scope exit. So there's a big overhead reduction for non-keep parameters.

It's possible to infer that a local copy of a pointer is non-keep. Iterators, for example, are almost always non-keep. Recognizing this eliminates most reference counting in inner loops.

All parameters to the standard C library functions and the Linux API are "non-keep". This is also true of most math libraries. "Non-keep" is the normal case for functions.

This is thus a way to do reference counting without excessive overhead. Most of the things that are done very frequently are done via non-keep parameters to functions.
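
A rough modern-C++ analogue of the distinction (the "keep" qualifier itself is hypothetical syntax, and these names are illustrative): a non-keep parameter can be a plain pointer or reference and causes no reference-count traffic, while a keep parameter takes a shared_ptr, whose copy is the explicit "I will retain this" operation.

#include <cstddef>
#include <memory>
#include <string>

std::size_t length_of(const std::string* s) {      // non-keep: no refcount update
    return s->size();
}

struct Cache {
    std::shared_ptr<std::string> kept;
    void remember(std::shared_ptr<std::string> s) { // keep: count bumped by the copy
        kept = std::move(s);
    }
};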

A long, long time ago I tried to talk the people who were designing what would become Java into this. I was not successful.

Sounds a bit like Baker's

Sounds a bit like Baker's scheme: Minimizing Reference Count Updating with Deferred and Anchored Pointers for Functional Data Structures, except you replace syntactic borrow/anchor binding forms with a static analysis that does the same. I'm not sure what it means to have delete permission if objects are reference counted though.

This thread has led to some interesting references!

Not really intended for functional programming.

Baker's system is more for functional programming. I was looking at this from the standpoint of how to build memory-safe low level systems. If you have functional programming and closures, you're probably in a higher level environment where garbage collection is useful and affordable.

The language safety situation at the higher levels of abstraction is in fairly good shape. It's the low-level systems programming, still done in C, where we still have vast numbers of memory-related bugs and crashes.

Low Level

I agree with this up to "systems programming". I think there are many specialised application domains where general purpose optimisation fails to produce good high performance code (and my hypothesis is that this is fundamental and unavoidable). So letting specialists write low-level code is a good idea. For example NumPy and SciPy for Python are written in C and offer a significant performance advantage; probably some applications for Python would not be possible if they were not.

I think there are many

I think there are many specialised application domains where general purpose optimisation fails to produce good high performance code (and my hypothesis is that this is fundamental and unavoidable). ... For example NumPy and SciPy for Python are written in C and offer a significant performance advantage; probably some applications for Python would not be possible if they were not.

That's a quirk of Python's "anything can change at any time" semantics. Optimizing inner loops in numerical code has been understood since the days of Backus' original FORTRAN compiler. It's also irrelevant to the GC issue.

Citation needed

Can you link to benchmarks or performance figures that show a general equivalence of speed to languages like C, Fortran or Ada? Fortran is not, as far as I am aware, garbage collected. Old Fortran just leaked memory; new Fortran has scoped variables.

The performance problem with garbage collection is to do with the extra dereferencing to allow objects to be moved, and the boxing of values. Automatic unboxing is not as straightforward as you suggest. This cost can be significant (halving the performance of some graph algorithms).

What is the fastest garbage collected language you are aware of?

I don't see anything in

I don't see anything in Baker's system that requires or implicitly depends upon functional programming or closures. It's specifically about how to eliminate unnecessary reference count updates by programming in a linear style and exploiting an "anchored pointer" that does not require such updates. That sounds applicable to systems programming to me.

The idea there is that if

The idea there is that if you can't tell at compile time what gets deallocated at block exit, maybe you can tell at run time. "An anchored pointer is a deferred increment pointer with an additional component which indicates its anchor." Can that be done without a dynamically-sized to-do list at block exit? (Go has such a to-do list, so that their "defer" construct will work.)

Yes, the dynamic semantics

Yes, the dynamic semantics of an anchored pointer is simply a fat pointer, i.e. the raw pointer paired with a stack depth. When returning an anchored value with depth X from depth N, you increment the refcount only if X > N. Furthermore, local bindings only need anchored pointers for, e.g., list traversals, so no refcount mods are needed for non-keep/anchored pointer parameters.

So the to-do list size is equal to the number of parameters.
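
A minimal sketch of that fat-pointer reading (field names and the exact depth convention are illustrative, following the X > N rule above):

#include <cstddef>

struct RefCounted { std::size_t refs = 1; /* payload ... */ };

struct Anchored {
    RefCounted* ptr;            // the raw pointer
    std::size_t anchor_depth;   // stack depth of the binding that anchors it
};

// Returning an anchored value from the frame at 'frame_depth': the count is
// bumped only when the anchor would not outlive the frame being exited.
inline Anchored return_from(Anchored a, std::size_t frame_depth) {
    if (a.anchor_depth > frame_depth)
        ++a.ptr->refs;          // anchor dies with this frame: promote to a counted ref
    return a;                   // otherwise no refcount traffic at all
}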

Did you ever implement this?

Did you ever implement this? I'm curious how it turned out.

Not started yet

I haven't had time to start yet, but it's still on my to-do list.