Higher-Order Programming without Closures?

The connections between higher-order functions and closures have been discussed here before, and clearly closures make higher-order programming more expressive. However, closures can sometimes make reasoning about space use and efficiency more difficult, as they penalize all higher-order function call sites (absent whole-program compilation).

I wonder if anyone has explored the space of languages with higher-order functions, but without closures. By this I mean that all functions-as-values are restricted to carry no environment, thus reducing them to C-like function pointers. We can add closures back using modules:

module type Fun =
sig
  type ('a,'b) t                      (* a closure; the environment type is hidden *)
  val new: ('a -> 'b) -> ('a,'b) t    (* pack a function pointer into a closure *)
  val apply: ('a,'b) t -> 'a -> 'b    (* invoke the packed function *)
end

So it's clear from function signatures whether an operation accepts a Fun or a function pointer. The space-use and efficiency trade-offs are thus quite clear.

I don't have a good intuition as to how painful this might be to program with, however. Eliminating closures clearly decreases expressiveness, but if we keep an expressive ML-like syntax with type inference, we haven't quite devolved to Java-like verbosity.

Have any languages taken this or a similar route?


Clarification, Cyclone

A small clarification. I know Cyclone uses first-class existential types to pack away an environment. I'm looking for something with a slightly friendlier syntax.

As stated...

I think it's a terrible idea :). Any code that uses raw function pointers is rendered less abstract than it ought to be, and with the explicit module syntax that you like it will be a nightmare to use Funs everywhere. I do think the observation that you can do this is a useful one though.

and with the explicit module

and with the explicit module syntax that you like it will be a nightmare to use Funs everywhere.

As I point out below, I was just reading about ATS and realized the same approach is possible in any language with abstract types. It can be made syntactically lightweight, as in ATS, so I don't think it need be onerous. ATS just annotates function and closure signatures:

<fun> int → int   (* function pointer *)
<clo> int → int  (* closure *)

The upside, as I explain below, is that more of the FFI and runtime can be moved into the language itself. I'm a big fan of moving as much as possible into the language to reduce implementation complexity.

The & syntax you outlined

The & syntax you outlined below wouldn't be that bad, and I appreciate the sentiment of trying to smoothly integrate lower level features. I guess I misunderstood what you were suggesting again. I don't natively speak ML (what does the type ('a, 'b) t line do?) and having been through related reasoning not too long ago I may have jumped to conclusions.

That said ... I thought one part of this suggestion was to give function types an abstract signature, so that code using functions (as opposed to creating them) could be written in a manner polymorphic in whether a function is a closure or raw pointer. What I think would be terrible is pervasive explicit passing of function modules to support this programming style.

And I'm still a little confused because if we're just talking about adding a function pointer type that will typically only be used near FFI boundaries, where do modules come into play?

I don't natively speak ML

I don't natively speak ML (what does the type ('a, 'b) t line do?)

Specifies that a module implementing the Fun signature must declare an abstract type t with two type parameters, 'a and 'b.

I thought one part of this suggestion was to give function types an abstract signature, so that code using functions (as opposed to creating them) could be written in a manner polymorphic in whether a function is a closure or raw pointer.

Not quite. What gave you that impression? The intent was to distinguish closure-using programs from function-pointer-using programs. This could be useful for optimization and space-usage analysis purposes.

And I'm still a little confused because if we're just talking about adding a function pointer type that will typically only be used near FFI boundaries, where do modules come into play?

Modules are one way to safely hide the environment that a closure needs to carry, so that code that uses a "('a,'b) Fun.t" can be used polymorphically.
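
As a minimal sketch of this in present-day OCaml (a GADT plays the role of the sealed abstract type, and 'make' stands in for 'new', which is a reserved word in actual OCaml):

module Fun : sig
  type ('a, 'b) t
  val make : ('env -> 'a -> 'b) -> 'env -> ('a, 'b) t
  val apply : ('a, 'b) t -> 'a -> 'b
end = struct
  (* the constructor existentially hides the environment type 'env *)
  type ('a, 'b) t = Clo : ('env -> 'a -> 'b) * 'env -> ('a, 'b) t
  let make f env = Clo (f, env)
  let apply t x = match t with Clo (f, env) -> f env x
end

(* 'add' captures nothing, so it could be a raw function pointer;
   'incr' packs the environment 1 behind the abstract type *)
let add env x = env + x
let incr = Fun.make add 1
let () = assert (Fun.apply incr 2 = 3)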

I think Scott Johnson hit the nail on the head here: this approach to closures is very close to introducing a single-method "function object", where the method signature gives the closure's argument and result types, and the object's fields hold the closure's environment (its captured free variables).

Ok, so modules are really

Ok, so modules are really only a part of this if the language doesn't provide closures at all, right? Presumably if the language provides both then to bundle some parameters with a function pointer, you'd just use a real closure and get a real function type?

I agree there are strong similarities to OOP function objects.

Ok, so modules are really

Ok, so modules are really only a part of this if the language doesn't provide closures at all, right?

Yes. The idea is to take a higher-level functional language, and bring it down a level. Right now this closure conversion is performed implicitly, and I'm discussing how to expose it explicitly. Existential Types for Imperative Languages is relevant here as well.

OOP...

...is an obvious answer to this question. Objects are of course similar to your modules example. But objects essentially are environments.

amen!

Although not news to anyone, that's the fulcrum on which my little pet language rests.

Modules?

I don't understand how modules help here. Nor how this interface can work at all.

First, your `new' operator cannot be implemented in a library, because it needs to close over an environment a library function would know nothing about. It has to be a primitive, just as conventional function types and closures are.

Second, even if it were a primitive, the argument to `new' is a function pointer. This is first-class, so at the point you apply `new' the semantics has no way of knowing under which environment you previously obtained this pointer.

Third, how are plain function pointers supposed to work? Either your language has nested scope, which means that the function will generally have dangling references of some kind. Or there is no lexical scoping, in which case Fun.t would serve no purpose.

First, your `new' operator

First, your `new' operator cannot be implemented in a library, because it needs to close over an environment a library function would know nothing about.

The environment would have to be reified as a value passed into 'new'. It can be any primitive value, like a record or tuple.

Some syntactic sugar may be possible as well. I first got this idea when reading about how ATS distinguishes between closures and plain functions in the types, and realized the same thing is achievable with just an abstract type. Languages already perform this closure conversion into a record containing the environment and a function pointer, so I'm just proposing that this control be exposed to the user.

This is first-class, so at the point you apply `new' the semantics has no way of knowing under which environment you previously obtained this pointer.

I'm not sure I understand this part of your argument. There are two approaches: capture the environment and distinguish closures from function pointers merely in the type, à la ATS, or the more low-level approach, which is to not automatically capture the environment at all. I don't see any particular difficulty with either approach.

Either your language has nested scope, which means that the function will generally have dangling references of some kind. Or there is no lexical scoping, in which case Fun.t would serve no purpose.

I don't follow this point either. The point is that we provide some control over environment capture. As an example of one possible benefit: we can directly express a C function-pointer signature without any runtime support for carrying environments. This makes writing an FFI within the language trivial. ATS takes a similar approach.

[Edit: corrected link]

ATS vs your proposal

What ATS seems to do is introduce a separate, restricted arrow type that classifies functions with empty closure environments. The FFI would require this type for any function passed to the external world. But the respective type has to be annotated at the point of function definition. Later on, you can only go from the restricted arrow type to the general one. This is easy, and I can see how it works. Note in particular that the restricted type is only ever inhabited by functions that do not require an environment. Note also that both types have to be primitive in the language, because they are both tied into the typing rules for function expressions.

You seem to be proposing something quite different. If I understand correctly, you want to be able to type any function with the restricted "pointer" type, and then somehow be able to add a closure environment on demand by explicit injection into Fun.t. I cannot see how this can be sound or implementable.

The environment would have to be reified as a value passed into 'new'. It can be any primitive value, like a record or tuple.

How do you practically reify "the environment" (in a compiled language)? The format of a closure environment depends on the individual function; I don't see how you can construct it independently in any practical manner.

If I understand correctly,

If I understand correctly, you want to be able to type any function with the restricted "pointer" type, and then somehow be able to add a closure environment on demand by explicit injection into Fun.t.

Depends what you mean by "any function". My "any function" means any function with no captured variables from the lexical scope, i.e. no environment. This is safe from what I can see. I'm having trouble understanding the contentious issue here, so let's try an example using my abbreviated notation:

(* simple function definition with no captured environment, so it's a simple function pointer *)
fun add x y : &(int → int → int) = x + y;;
(* partial application constructs a function with environment *)
let incr : int → int = add 1;;
(* captures 'x' *)
fun foo x : int → (unit → int) = fun () → x;;
(* does not capture 'y', so safe as a function pointer *)
fun bar y : int → &(unit → int) = fun () → 1;;

The format of a closure environment is dependent on the individual function, I don't see how you can construct it independently in any practical manner.

Yes, it's function-specific. I don't understand the problem.

Relation to initial proposal

Your example seems to mirror what is described for ATS, and makes sense, as far as I can see. What I do not see, though, is how this relates to your initial proposal of defining a module Fun with an abstract type t for closures and an injection operator `new'. That's what my scepticism is directed at. Can you explain that in more detail?

The intent is that the

The intent is that the language would provide only for function pointers, and closures could be defined as a module within the language itself. Like a higher-level ML-like C. I haven't quite nailed down the implementation yet. Here's what I have so far:

module Fun =
struct
  (* hide the type of the environment 'a *)
  type ('b, 'c) t = forall 'a.('a, 'b → 'c)

  (* captures single-arg functions exposing the argument *)
  let close0 : ('a → 'c) → ('a,'c) t = fun f → (void, f)

  (* this captures single-arg functions closing over the arg *)
  let close1 : ('a → 'c) → 'a → (void,'c) t = fun f → fun env → (env, f)

  (* used to encapsulate the environment of N-arg functions *)
  let closeN : ('a → 'b → 'c) → 'a → ('b,'c) t = fun f → fun env → (env, f)

  let apply : ('b,'c) t → 'b → 'c = fun (env, f) → fun arg → f env arg
end;;

Closures are not just currying

type ('b, 'c) t = forall 'a.('a, 'b → 'c)

It has to be exists, not forall. The producer is to choose it, not the consumer.

But that's not the real problem. The real problem is that you seem to mistake currying for closures. But a closure environment doesn't just stem from curried arguments. How are you going to deal with definitions like

val n = ...
...
fun f x = x + n

Here, f has to be represented by a closure with a non-empty environment.

Here, f has to be represented

Here, f has to be represented by a closure with a non-empty environment.

Such a definition is not possible in the low-level language, unless n and f are both top-level definitions and thus global. The behaviour is just like C.

Hmm, this seems circular.

Hmm, this seems circular. Consider the definition of apply:

let apply : ('b,'c) t → 'b → 'c = fun (env, f) → fun arg → f env arg

What do the arrow operators mean in the type of apply? This is not a simple function pointer type, because apply is closed over the module it's in.

Another unrelated comment I would make is that I don't think you should identify function pointers with closures whose environment happens to be empty - they are conceptually different (the latter is the former plus an empty environment) as well as different in implementation (since the latter will still take the environment as a hidden first parameter).

Nevermind, I see. ('b,'c) t

Nevermind, I see. ('b,'c) t was supposed to be an existential type - existentials are doing the lifting here, not modules.

Have you considered something like this? You could let 'a &-> 'b be your primitive function pointer and let 'a -> 'b be sugar for a signature containing a type 'env, a value of that type, and a function pointer of type 'env * 'a -> 'b. Change your fun primitive to generate a struct satisfying this signature by reifying the environment into a structure for you (so you get the low level access without losing too much of the power). You could require the captured variables to be explicitly stated (as in: fun {x,y,z} ...) or you could leave it implicit or support both forms. Finally, I think you'd want to add sugar making application on these things look like normal function application.
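
A rough OCaml model of that desugaring (a sketch; Pack and apply are illustrative names, and the capture-list form shown in the comments is the proposed sugar, not real syntax):

type ('a, 'b) clo = Pack : 'env * ('env * 'a -> 'b) -> ('a, 'b) clo

let n = 10

(* proposed sugar:   fun {n} x -> x + n   would elaborate to an
   environment value plus an environment-free code pointer *)
let f = Pack (n, fun (env, x) -> x + env)

(* application sugar:   f 5   ==>   apply f 5 *)
let apply c x = match c with Pack (env, code) -> code (env, x)
let () = assert (apply f 5 = 15)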

Have you considered

Have you considered something like this? You could let 'a &-> 'b be your primitive function pointer

I like that notation.

Nevermind, I see. ('b,'c) t was supposed to be an existential type - existentials are doing the lifting here, not modules.

Yes, in this implementation. Originally, I was formulating it as a Fun signature, so each closure would have to provide its own module implementation conforming to Fun, with an abstract type to hide the environment. As I'm sure you know, abstract types have existential type, so existentials are at the core here regardless. I started on a concrete module to see if I could avoid so many module duplications, but I think the wizardry that would be needed is beyond me.

You could require the captured variables to be explicitly stated (as in: fun {x,y,z} ...) or you could leave it implicit or support both forms.

This is exactly what I had in mind when I started this thread; sorry if that wasn't clear everyone! The Fun signature accomplishes this if each closure creation site implements a module closing over the environment of that site. I was thinking here of some sort of first-class modules, but first-class existentials work too.

Ok good, so I did understand

Ok good, so I did understand your original intention. But the original definition of Fun doesn't work for this purpose, right? 'new' can't work because the function pointer doesn't have the environment parameter, and 'apply' can't work because the given signature would require closure over the module (making it no longer a primitive function pointer).

Also, I understand the relationship between modules, signatures and existentials - the change-up just threw me off for a bit. Sorry to everyone for the high noise/signal ratio that resulted from my confusion!

Anyway, the idea sounds good. Using existentials to hide the closure type probably makes as much (or more) sense than using a module, and I'd favor structs/records over trying to encode things into tuples at the language level. That looks like a mess.

But the original definition

But the original definition of Fun doesn't work for this purpose, right?

Correct, I forgot the existentially hidden parameter for the environment. Even then there are some problems with it, and I think I'm beginning to understand Andreas' objections to it. I was more precise with my words than I was with my module signature. I guess I forgot where I was posting to. ;-)

'apply' can't work because the given signature would require closure over the module (making it no longer a primitive function pointer).

I think you mean the curried form wouldn't work. I was just using the curried form because it's what I'm used to from working with OCaml. The final form would have tupled signatures. Something like:

module type Fun =
sig
  type env
  type ('b,'c) apply = env * 'b → 'c
  type ('b,'c) clo = env * ('b,'c) apply
end;;

The 'new' or 'close' functions to create an environment may have to actually be defined by the function-specific implementation of the Fun module. I think the above is more closely aligned with what I have in mind and recovers the full generality of closures. All closure-accepting functions accept a Fun.clo. The Fun.apply is a function-pointer type.

Clearly at some point in the compilation process, all functional languages reach the point where closures are represented by raw function pointers, and records/tuples. If one tries to maintain type safety throughout compilation, what is the type-safe representation? I think the above is one way to do it.

[Edit: simplified the signature. It typechecks in OCaml too.]

Ok I see - I was actually

Ok I see - I was actually talking about a different use of modules - where the module itself is the closure. In (invalid?) ML:

module type ('a, 'b) Fun =
sig
  type env
  val e : env
  val f : env * 'a -> 'b
end;;

Would this work too? Or would there be a big overhead to this?

Modules can't have type

Modules can't have type parameters, so that signature would be rejected (the type parameters must be applied to the definitions within the module). Modules also don't have a runtime representation, which fulfills my goal of a low-level representation with a straightforward memory footprint.

Thanks


Such a definition is not possible?

Hm, OK, but then you are essentially depriving first-class functions of 90% of their utility and generality - currying really is the least interesting form of closure construction. I don't see how such a limited form of closure would warrant the complication of two arrow types, relative to the plain C way of things. What useful abstractions can you build with such a type?

(Note that ATS happily allows functions like f, in any scope.)

Hm, OK, but then you are

Hm, OK, but then you are essentially depriving first-class functions of 90% of their utility and generality

I'm not sure I understand. I'm trying to re-establish the full generality of closures. What use-cases are ruled out?

currying really is the least interesting form of closure construction.

I'm only using curried forms of the functions because I'm used to OCaml notations, which I'm using to typecheck some of my definitions. The final implementation would have to use tuples to describe the activation frame, or some other lower-level structure with appropriate packing and unpacking primitives. Like I said, I'm trying to build higher-level abstractions from a lower-level language. I just have to figure out exactly what types of primitives are required.

For instance, a naive translation to tuples wouldn't quite work because I would need close2, close3, close4, ..., implementations to unpack larger and larger tuples, with a corresponding number of applyX functions:

module Fun =
struct
  (* hide the type of the environment 'a *)
  type ('b, 'c) t = exists 'a.('a, 'b -> 'c)

  (* 'void' being a nonexistent argument, this captures single-arg functions *)
  let close0 : ('a -> 'c) -> ('a,'c) t = fun f -> (void, f)

  (* this captures single-arg functions closing over the arg *)
  let close1 : ('a -> 'c) * 'a -> (void,'c) t = fun (f, env) -> (env, f)

  (* used to encapsulate the environment of N-arg functions *)
  let closeN : ('a * 'b -> 'c) * 'a -> ('b,'c) t = fun (f, env) -> (env, f)

  let apply : ('b,'c) t * 'b -> 'c = fun ((env, f), arg) -> f (env, arg)
end;;

Curried forms ignore this problem until I have a better handle on the solution. ;-)

Perhaps some form of row polymorphism to represent the portions of the record I don't care about. Any thoughts on that? I want to keep it low-level, so only basic memory operations which don't allocate should be needed.

(Note that ATS happily allows functions like f, in any scope.)

I can see how it would allow closures like f (<clo>), but not functions (<fun>). What does the in-memory representation look like? Closures are an (environment, function pointer) pair, and for functions to be meaningfully different, and usable from C, they shouldn't carry an environment.

Then again, ATS is currently a whole-program compiler, which violates the fundamental assumption of this approach. They may simply compile a different version of the function replacing the captured variables with their constant values (or extracting them into globals).

Lambda-lifting

I'm only using curried forms of the functions because I'm used to OCaml notations

My point wasn't the way you wrote the signature (there are a number of minor bugs in there). My point was rather on the conceptual level: what can you actually do with these functions?

The only way to build "closures" with this approach is to manually lambda-lift all your functions (i.e. turn all their external references into additional arguments). From that you can then construct a trivial form of closure where parts of these arguments are applied. The latter basically is an encoding of currying, or rather, partial application.
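
Concretely, the manual lifting amounts to something like this (an OCaml sketch):

let n = 10

(* instead of   let f x = x + n   (which needs a closure): *)
let f_lifted (n, x) = x + n      (* the free variable becomes a parameter *)

let f_clo = (n, f_lifted)        (* the manually built "closure" *)
let apply (env, code) x = code (env, x)

let () = assert (apply f_clo 5 = 15)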

But the requirement for lambda-lifting means that you have to go through exactly the same tour de force as you have to in C if you actually wanted to do higher-order programming. The only difference would be that your closures are statically typed, while in C they have to use unsafe downcasts for the environment (usually from a void pointer). But C++ or Java can already express that in a type-safe manner with objects (as far as their type safety is trustworthy).

The ATS approach is different. It allows implicit closures like any other FPL. The only added thing is that there is a second arrow type for non-closure functions. Function expressions are basically overloaded for the two arrow types. But the restriction on external references only applies to those typed under the non-closure type. There is no particular magic here (the programmer has to annotate the type), and it has nothing to do with whole-program compilation.

I think naasking's idea was

I think naasking's idea was two parts: 1) expose closures as existential types and 2) force users to build their own closures. I think you're mainly arguing against 2) here, but I think he's indicated that was more of a side idea he was toying with. Idea 1) looks more interesting with some relative advantages/disadvantages to ATS's approach.

My point wasn't the way you

My point wasn't the way you wrote the signature (there are a number of minor bugs in there).

Indeed. I revamped the signature in this post. And yes, this approach requires manually lifting captured variables to function parameters.

My point was rather on the conceptual level: what can you actually do with these functions?

I also describe a possible use in the above post. Basically, it would be nice to have a safe low-level language with a straightforward compilation to the machine. It can be used as a target for higher-level languages or for FFIs, instead of always having to escape to C or other unsafe languages. There are also folks who wouldn't touch high-level languages with a ten foot pole because they like being "close to the machine". A language with this approach fills C's niche in a sense.

I was asking whether there are any languages out there taking a similar approach, other than Cyclone. I also don't mean C-dialects which insert dynamic checks to ensure memory safety, but a statically checked low-level language.

But C++ or Java can already express that in a type-safe manner with objects (as far as their type safety is trustworthy).

Exactly. I'm talking about a mid to low-level functional-style language with more explicit control. This can be used as a target for higher-level languages, or just a low-level language in its own right.

Typed closure conversion

Then maybe you want to have a look at Morrisett et al.'s paper on Typed Closure Conversion. As far as I can see, their internal language and type system are basically what you are after. (Morrisett's thesis might also be of interest.) I remain unconvinced that anybody would really want to use this low-level representation in a source language, though.

I'm sure we both find it

I'm sure we both find it unfathomable why people still use C in so much application software, yet it remains quite popular!

Thanks for the pointers. I believe Morrisett's thesis launched the "intensional type analysis" series of papers, which I'm quite familiar with.

BitC?

I don't know the details, but from the high-level abstract description it sounds like BitC might be food for thought.

BitC goes quite a bit

BitC goes quite a bit further though. They're tackling theorem proving as well.

Eventually, but not soon

We're definitely going there eventually, but right now we're mainly focused on getting a usable version of the language out.

BitC doesn't lift this way.

BitC puts all the captured args into a structure-like thingy that becomes the closure record.

If you lift these things into the parameter positions, you break the ability to maintain a consistent caller-side convention that depends exclusively on the function's external signature. If you lose that, you tend to lose both the native calling convention and any reasonable approach to separate compilation.

Syntactically lightweight distinction

As I point out above, ATS makes a lightweight syntactic distinction between closures and functions in the type. As another possible syntax, consider:

int → int    (* standard closure *)
&(int → int) (* function pointer, to borrow from C convention *)

There would be an implicit conversion from function pointer to closure, but no reverse implicit conversion. This simply exposes the currently hidden behaviour of functional languages to the user.
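
As a sketch of why the conversion is one-way (in OCaml, with illustrative names): a function pointer can always be promoted by reusing the pointer itself as the hidden environment, but there is no general demotion, because an arbitrary environment cannot be erased.

type ('a, 'b) clo = Clo : 'env * ('env * 'a -> 'b) -> ('a, 'b) clo

(* the wrapper lambda captures nothing: both f and x arrive as parameters *)
let of_ptr (f : 'a -> 'b) : ('a, 'b) clo = Clo (f, fun (f, x) -> f x)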

Implementation techniques for first-class functions?

I am a bit confused, but I will throw this out here anyway: Defunctionalization at work - at least Chapter 1 is well worth reading if you haven't done so already.

it looks like you're trying

it looks like you're trying to cook up a reasonable external notation for what any modern type-preserving compiler should do internally for closure conversion (i.e. making the environment paired with the function explicit)

The thing I don't understand is: what is an actual use case for having this? Part of the benefit of having the environment implicit, as is typical in functional-style code, is that the compiler can reason about what substitutions it can do statically (e.g. control-flow analysis and eta substitution), and an intelligent compiler will only save the parts of the environment which the closure will actually need.

More accurately, the compiler will make a trade-off between constant-time closure creation (by just keeping a reference to the entire current environment that constitutes the closure's defining lexical scope) and being safe in terms of space usage (e.g. traversing the entire current environment and only keeping references corresponding to the variables used in the function). On average I actually don't think this is an important trade-off.

So what sort of use case do you see, and may I see a concrete example?
Also, would your notation work for a lazy language? It doesn't look like it will.

it looks like you're trying

it looks like you're trying to cook up a reasonable external notation for what any modern type-preserving compiler should do internally for closure conversion (i.e. making the environment paired with the function explicit)

That's precisely it. I used your exact same example in a post I just wrote above. I think the signature there more accurately reflects my purpose.

As for examples, I have nothing concrete. There is a prejudice against functional languages, sometimes due to their implicit allocations and hidden memory use. Less so today, but it's still there. Giving up control is a sin for some of these folks, so it'd be nice to show that a functional language can be made explicit wrt allocations and execution efficiency, and yet still maintain its safe, expressive nature.

Closure cost is an implementation trade-off

A language that implements closures has to trade off between the cost of *creating* a closure and the cost of *invoking* a closure. Arguably, the invocation cost is more significant on average. You're probably wasting CPU if you create a closure to invoke it only once.

The standard implementation of a closure tacks an environment to a function body. So the cost of invoking the closure becomes the cost of "entering" the environment before evaluating the body and "leaving" it when you're done. The cost of creating such a closure is small, but the cost of invoking it might be dependent on the size of the environment you tack on.

An alternative implementation might be to copy a function body with pre-bound values appearing as constants in it. This way, you get a function pointer (a new one) with no environment to "enter" and "leave". So the cost of closure invocation is nearly down to C levels, but the cost of closure creation is now dependent on the size of the function body.

So if you have

int add( int a, int b ) { return a + b; }

and you want to create a closure add2 which binds b to 2, you can create a new function like this -

int add2( int a ) { const int b = 2; return a + b; }

which a good compiler can transform to

int add2( int a ) { return a + 2; }

Yes, I've always assumed

Yes, I've always assumed that code-generation to create closures was too expensive. Then again, as you say, this is probably less common than invoking the closure.

Pairing a function pointer with an environment and calling it indirectly, while simple, is getting progressively more costly. CPUs have deeper pipelines, and indirect branches cannot be predicted very well in polymorphic languages. Perhaps someone knows of a way to cheaply generate code at runtime for closures to avoid branch mispredictions?

Shaders ..

"Shaders" that run on GPUs are probably the best example where even a substantial run time compilation overhead is acceptable for the performance savings of running a highly optimized "kernel" over, say, an image. In this case, the driver translates your shader into a branch optimized form for the *current* set of "uniform" parameters - essentially a closure - and sends it over to the GPU. Actually I'm not sure where these optimizations occur (GPU or CPU), but they do occur.

Compiling for Runtime Code Generation

Here's an interesting paper describing an earlier version of Cyclone with runtime codegen (RTCG). RTCG is performed by stitching together statically compiled and optimized templates and substituting values in the slots. This seems like it would be quite an efficient RTCG strategy.

I wonder how expensive closures constructed via such RTCG are when compared to closure-converted functions as Andreas explained below. Closure conversion and lambda-lifting would seem to suffer from poor register use and excessive memory traffic, since environment parameters need to live in a record. If the closure is generated by RTCG, these parameters could be folded into the generated code. This requires heap allocation to store the new function and code GC to reclaim it, but arguably staged languages should already include these features.

Various optimizations

We spent some time thinking about this for the BitC implementation. There are a bunch of implementations, but there are a couple of things that most descriptions tend to forget to talk about, notably:

  • Most closures are trivial, in the sense that they only reference global variables. Such functions do not need a closure pointer at all.
  • The presence or absence of a closure is not part of the externally visible function signature, and consequently should not be allowed to alter the caller-visible protocol. If you let it do so, it becomes a problem when you go to talk to the native calling convention.
  • On many architectures, the first few parameters are registerized, and reserving a parameter slot that is nullified when no closure is required is very expensive. This induces very noticeable performance loss, and breaks the native calling convention. Building a stub that rewrites the frame is very tricky, and on some architectures it depends on argument types and order of appearance. Also, it's expensive.
  • There is also a strong desire not to dink with the normal calling convention when calling non-closed procedures. This makes foreign function interfaces much, much easier.

The main problem is that injecting a pointer into a registerized calling convention can be hugely nasty, and it is not always possible to correctly rewrite the caller-side frame in such a way as to straightforwardly inject the closure pointer into a receiver-visible location. Also, note that the rewrite in this case to re-sort the registers at the call is pretty expensive. It may be better in this case to just have a thread-local "current closure" pointer that is a bolt-on to the native calling convention.

My point, I suppose, is that this is one of those places where a simple, portable solution works pretty well, while more aggressive solutions rapidly become highly machine dependent.

The comment someone made about prediction is wrong. All of the branches involved at the call site and the stub are forced taken. No prediction is required or useful.

Costs

Closures are constructed at runtime, so the copying has to happen at runtime, and thus can take no advantage of static compiler optimisations. Consequently, its cost will be vastly higher than the standard scheme most of the time - and calls to a copy are still indirect, unless you combine the technique with other sophisticated runtime optimisations. You have to use the instance really heavily to actually balance that cost. (FWIW, the Alice ML compiler does such runtime code specialisation, but only for functors.)

Also, I think there is some confusion in this discussion regarding the cost of closures. The dominant cost on modern CPUs indeed is the indirection through a function pointer. But you pay that regardless of whether your first-class function is actually a proper closure or just a plain pointer. On the other hand, even in FPLs you can easily optimise the cases where you statically know the callee, by avoiding that indirection. And in those cases you can also easily omit passing an environment should it actually be empty.

In summary, those cases that you can actually express in C (as plain calls) should not be any more expensive in a higher-order language with closures. You only really pay for the first-class cases (which you pay for in C as well).

But you pay that regardless

But you pay that regardless of whether your first-class function is actually a proper closure or just a plain pointer.

I would expect a closure to incur two indirect calls, not just one. The function pointer in the (environment, function pointer) pair is not to the target function, but to a function which unpacks the environment and sets up the activation frame. I would expect that all functions with the same activation layout would share the same unpacking function, which means we need another function pointer to invoke the actual target, and it costs us another indirection. Or does each target get its own unpacking function? I wonder what the space overheads are in the two approaches.

Only one indirection

There are of course many ways to implement closures, but the canonical translation is simply by lambda-lifting the environment:

\x.e   ==>  (\(x,y).e[y.i/yi], (y1,...,yN))
            where {y1,...,yN} = FV(e) - {x}
e1 e2  ==>  e1.1(e2,e1.2)

The lambda on the RHS is the only code pointer, and its application the only indirection.

This scheme is typically combined with some form of arity raising, to optimise curried functions. Moreover, with some form of subtyping or record polymorphism you can flatten the two nested tuples to avoid an additional allocation and indirection for the environment. I.e.,

\(x1,...,xN).e   ==>  (\(x1,...,xN,y).e[y.(i+1)/yi], y1, ..., yM)
                      where {y1,...,yM} = FV(e) - {x1,...,xN}
e0(e1,...,eN)    ==>  e0.1(e1,...,eN,e0)

If e0.1 is known statically you can insert a direct call. And if you know M = 0 at the call site, then you can also drop e0 from the argument list.
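
For instance, instantiating the flat scheme for \x. x + n gives, as an OCaml sketch (a record stands in for the flattened closure tuple, code pointer first, environment fields after):

type clo = { code : int * clo -> int; n : int }

let code (x, self) = x + self.n    (* \(x,y). x + y.2 *)
let make n = { code; n }           (* (\..., n)       *)
let apply c x = c.code (x, c)      (* e0.1(e1, e0)    *)

let () = assert (apply (make 10) 5 = 15)

Note that only a single indirect call, through c.code, happens per application.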

Synthesis

Closures are constructed at runtime, so the copying has to happen at runtime, and thus can take no advantage of static compiler optimisations.

It is possible to preserve some optimization and quick code generation by partially evaluating an optimizing compiler against the code, as in Synthesis or Fabius.

With a slightly more interesting add function

\a -> a + 2*b

a static compiler can do constant propagation and decide to generate an immediate add of a and 2*b without knowing the value of b, just that it will be constant. The compiler could produce the template

addl $0, %eax; ret

(05 00 00 00 00 C3 in machine code) and a closure-making function that takes b and makes a copy of the template with the immediate replaced by 2*b.

I don't know how well this would work with more sophisticated optimizations, but something like GHC's rewrite rules might be possible.

makes sense to me

This is hopelessly vague, but I think the idea is probably important. Naasking is presenting it from a types perspective. My angle is implementation. There's a firm, important distinction between closure-less "routines" and fn/env bundles. It's apples and oranges, and it matters a lot throughout the low levels of implementation. It's "obviously" an important distinction, so of course type theory needs an accounting of it.

I'm a type theory luddite but, as an example of a *dynamic* language I'm working on, there's basiscraft.com/fsb (link left "raw" like that since it isn't a finished work... it's just a tentative project... don't waste your time looking for perfection or even completeness, yet). In that system, even though the ultimate aim is quite high-level programming, it's very valuable (from the instruction set design on up) to make a distinction between routines and closures.

It's an interesting period. It seems like more and more people who think about high level programming language stuff are turning their attention to the application of that knowledge to very low level "systems" type programming. How can cool stuff possibly fail to unfold over the next few years?

-t

stack-based closures (I'm not sure about the terminology)

GCC extends C with stack-based closures - meaning that you shouldn't return a closure with a non-empty environment; it is only valid while its enclosing stack frame is live. So we can enjoy these limited closures, and yet reasoning about space and time efficiency is easier.

example: main.c (compile with gcc -std=gnu99)

Things get especially cool when doing multi-threaded code; abstractions like Google's MapReduce and Erlang OTP - which abstract away concurrent code - are pretty easy.

GCC nested functions

Nested functions may be useful, but the restriction that they cannot be returned limits their applicability.

My personal question is: should the variables closed over be stored in the environment as values or as references (that can change value)? The former matches better with my language semantics, as the functional core does (semantically) no in-place update. E.g.:

def x = 1
def f = fun(->ret(@)) ret(x+1)    #note: @ is used as placeholder for a value with inferred type
.x = 5           #assign 5 to x
console!out.Printf("f returns %d", f)

That displays 2.

I am also curious about the semantics of doing "return" inside a closure: what happens if the declaring function has already returned when the closure is used?

You just discovered

You just discovered one reason why implicit mutability and implicit dereferencing of mutable reference cells is bad. :-) In languages with explicit references (e.g. ML), there is no ambiguity and no surprise:

x = 1          -- immutable
f () = x+1     -- constant
y = ref 1      -- mutable
g () = !y+1    -- state dependent
z = !y         -- immutable
h () = z+1     -- constant

For languages with implicit references, closing over the state (i.e. like g above) is the natural and vastly more useful interpretation. Nevertheless, some languages get that wrong.

Re return statements: allowing return to apply to anything but the innermost (possibly anonymous) function is also a pretty dubious idea in a higher-order language - it amounts to a limited/ad-hoc form of call-cc. See e.g. the closure proposal for Java for all the ugliness implied if you try to come up with a semantics for that.

My lambdas have no mutable element

Well in my case lambdas are not allowed to change state, so you could not write the equivalent of

g () = y := !y + 1

Given this I believe that closing on values instead of references is appropriate.

On the other hand, I use explicit continuations for returning from a lambda, so the enclosing function's return continuation is within scope :-(
I can just disallow any continuation which is not declared in the lambda. That would be relatively coherent with the rest. I did not tackle call-cc yet; is it very useful?

innermost return

return is indeed a strange beast. It also interacts poorly with macros: if your macro expands into a nested function, then return will break out of this nested function rather than the apparent enclosing function in the source text; for example, if let expands into lambda, this function would return #f:

(lambda (x)
  (let ((y 42)) (return y))
  #f)

Capturing

In a sense, this is a capturing/hygiene problem. Return implicitly refers to a continuation captured at the entry of the surrounding lambda. Now, an implicit reference basically is equivalent to a named reference where you have only one name - and no alpha conversion. But such alpha conversion would be crucial to avoid the capturing with macros that you describe.

You could imagine a saner version of return that allows naming the continuation. You could still keep the implicit version as syntactic sugar, but its expansion would be alpha convertible and thus robust with respect to macros.

yes

This is absolutely a hygiene issue. I have more thoughts on the issue, but should probably write my dissertation first. ;)