Lambdas and objects as an existential type

I’m looking at type theory from point of view of the general-purpose programming language evolution, and I’ve found something that does not look nice.

There is stage when function pointers like ones in C were introduced to the languages. Differently from lambdas they are stateless and they could be only evaluated using their explicit arguments and using only constants (or global state) and by calling other stateless functions. Their type will be designated as:

T ⇨ A

Now let’s take usual lambdas, their type will be designated as:

T → A

The relationship between these two types is the following:

A → B = ∃ T (T ⨉ (T ⨉ A ⇨ B))
apply:  (A → B) ⨉ A ⇨ B = ? ((t, f), a), f(t, a) // unpacking existential type here
curry: ((A ⨉ B) → C) ⇨ (A → B → C) = ?(t, f) ((t, f), (?((t1, f1),  a1) ((t1, f1, a1),  (?((t2, f2, a2), b2)  f2(t2, (a2, b2))))))

Such definition is needed to support currying and to capture implicit environment, and it reflects what is actually happens in the code. Every function from T ⇨ A could be trivially converted to T → A, however stateless functions have different set operations supported. Particularly, currying is not supported. So, it is not subclass, but a separate entity type.

This existential type captures difference between structured programming paradigm and object-oriented or functional programming paradigms. So, theoretically, when we would try to study paradigms, we would also need to make this distinction.

I’m interested if there are some research papers that describe this distinction. The discussion in the TaPL book is somewhat unsatisfying because explicit existential types are defined using implicit existential types. Theoretically, the type theory should start from stateless functions to reflect evolution path rather than in reverse direction. Such point of view might be also useful for the compiler transformation reasoning, so there might be papers on such equivalence in that area too.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

It's a useful observation

I followed this same line of reasoning to distinguish closures from contextless functions in hobbes (a PL/database project we use at Morgan Stanley), where closures have exactly the existential type that you describe.

I'm not sure that a research paper is warranted for this observation, but there are some interesting complications.

For example, it's not necessarily the case that currying can't be done but just that the type relationship isn't exactly the same as for closures. That is, a curried contextless function must be a closure (just as a curried closure is).

To address this kind of thing in hobbes (where we are especially sensitive to runtime performance), my approach so far has been to use type classes to capture these kinds of type relationships and to require that qualifications in types (e.g. via type classes) are fully eliminated at compile-time.

Currying within the class A ⇨ B

What I meant, currying is not possible while staying within the class A ⇨ B:

curry_not_possible: ((A ⨉ B) ⇨ C) ⇨ (A ⇨ B ⇨ C) = ???

Actual runtimes sometimes do the trick with code generation to achieve this, but it is escaping the paradigm through low level code. What you have described, is lifting stateless function to closure than performing currying on closures (then optimizing a bit):

lift: (A ⇨ B) ⇨ (A → B) = ?f (f, (?(f1, a) f1(a))) 

I think a research paper on the topic would have been useful with at least with the following content, so someone on the site could even bother to write it:

  • Formal class definition for A ⇨ B
  • Differences between classes A ⇨ B and A → B
  • Formal translation specification between classes, relationship with lambda-calculus theorems
  • Relationship to von Neumann architecture

The article would have been a quite good. If one gets serious, it could even grow to a PhD thesis.

The meaning of ->

I'm definitely interested if you feel that there's more to say about it, sorry if I sound negative. I think that the type "exists E.((E*a)->b)*E" (which you yourself identified) sort of _is_ the fundamental theorem or relationship between these contextless functions and closures.

In terms of the relationship to the von Neumann architecture, I am guessing that you mean that contextless functions can be represented by a pointer to code (in contrast to closures which are doubly-indirect and so a bit more expensive to invoke).

I suppose that the closure type also works as a "least among many" carrier of function types the same way that products and sums do for tuples and tagged unions (so that e.g. all types from which an int and double can be projected can be mapped to int*double).

It's important to distinguish contextless functions and closures, I will grant you that. And certainly many PLs don't allow you to make that distinction (and the ones that do are considered "ugly" in some way).

Typed closure conversion

Your "stateless functions" are often called "function pointers" in the literature, because they can be implemented just like function pointers in C -- because you don't need to allocate a closure for them, you can just pass the bare address of the function around. As a result, this kind of function is often introduced as part of the closure-conversion pass of a compiler in functional programming languages, thereby decomposing function types into simpler, easier-to-implement bits.

See, for instance, the classic POPL 1996 paper by Yasuhiko Minamide, Greg Morrisett and Robert Harper, Typed Closure Conversion.

Thanks, this is just a link

Thanks, this is just a link I wanted.

Closure Conversion

As you may be aware, the process of lowering implicit closures into explicit closures is typically called closure conversion, and there’s a wealth of information available about it online. You can certainly make the existentials explicit in your internal representation or even surface them to the programmer. For example, something like this (using ⇒ to denote function pointers, and assuming they can take multiple arguments directly):

curried_add : int → int → int
curried_add = λx. λy. x + y

main = print(curried_add 1 2)

⇓

curried_add : (int) ⇒ ∃C. C × ((C, int) ⇒ int)
curried_add(x) = pack int as ∃T. T × ((T, int) ⇒ int) in pair(x, lambda1)
  where lambda1(closure1, y) = closure1 + y

main = print(unpack (c, f) = curried_add(1) in f(c, 2))

I’m doing this in my language, Kitten, to differentiate between boxed and unboxed closures.

Of course in this case we could lower straight to curried_add(x, y) = x + y and only actually allocate a closure when curried_add is not fully applied.

The same transformation works for objects in an object-oriented language, where the vtable is existentially quantified; or for “trait references” in a language like Rust, where the existential is on the reference instead of being embedded in the object, implemented as a “fat pointer” consisting of a pair of a vtable pointer and an object pointer.