Lambda: A Peek Under the Hood

Brian Goetz presented Lambda: A Peek Under the Hood on October 31st, 2013.

I found the interaction between the implementation and language design for lambdas interesting.
It's nice that the open-ended solution with invokedynamic avoided over-specification and leaves room for a good selection of future optimizations.
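For the curious, that translation strategy can be exercised by hand: javac compiles a lambda to an invokedynamic call site whose bootstrap method is LambdaMetafactory.metafactory, and we can invoke that same bootstrap ourselves. A minimal sketch (class and method names are mine, not from the talk):

```java
import java.lang.invoke.*;

public class MetafactoryDemo {
    static boolean ran = false;

    private static void sayHello() { ran = true; }

    // Calls the same bootstrap method javac wires into the invokedynamic
    // call site it emits for a non-capturing () -> sayHello() lambda.
    static Runnable makeRunnable() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle impl = lookup.findStatic(
                    MetafactoryDemo.class, "sayHello",
                    MethodType.methodType(void.class));
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "run",                                  // interface method name
                    MethodType.methodType(Runnable.class),  // factory type: () -> Runnable
                    MethodType.methodType(void.class),      // erased SAM signature
                    impl,                                   // implementation handle
                    MethodType.methodType(void.class));     // instantiated signature
            return (Runnable) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        makeRunnable().run();
        System.out.println(ran); // prints true
    }
}
```

Because only the bootstrap call is fixed in the class file, the JVM is free to change what the metafactory spins behind it, which is exactly the under-specification the talk argues for.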


nota bene

Video format, one hour, Java, lambda in Java, Oracle's Java Language Architect Brian Goetz. (Any good time marks to sample for quality?)

slides and time marks

I believe these are the same slides: Lambda: A Peek Under The Hood - Brian Goetz, from JAX London in 2012.

For interesting spots, two come to mind:

"Why not “just” use MethodHandle?" @ minutes 19-20 for slides 14-16

"No language feature is complete without some interaction with serialization" @ 47-48 for slides 35 & 36

Thunks expensive in Java

Did I get it right?

or maybe

Java is the Worst Form of Programming Language, Except Some Others - WSC.

How is that related?

I think the JVM is generally considered hostile to continuations, but I fail to see the relationship here, since lambdas don't involve the stack copying that continuations need. Wouldn't that be the larger cost?

According to the third paper on this page, Asynchronous Programming, Monads and Continuations in C#, F# and Scala, continuations can be made to perform reasonably well on the JVM without this.

A thunk is a closure awaiting evaluation

Well, the thunk/lambda relation seems obvious. I, on the other hand, have no idea how a closure relates to a continuation.

closure vs continuation

Obviously I shouldn't have depended on my memory of what a thunk was.
I read about them in the context of lazy evaluation, so they seemed more like continuations to me, but I see now they are much like generating a lambda that captures some context.

So it seems like invokedynamic would make thunk generation easy.
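Even without dropping down to invokedynamic, a thunk in the lazy-evaluation sense falls out of a lambda plus a cache. A minimal sketch (class names are mine):

```java
import java.util.function.Supplier;

// A thunk in the lazy-evaluation sense: a closure awaiting evaluation,
// which caches its result after the first call (call-by-need).
final class Thunk<T> implements Supplier<T> {
    private Supplier<T> compute; // the captured computation; null once forced
    private T value;

    Thunk(Supplier<T> compute) { this.compute = compute; }

    public T get() {
        if (compute != null) {
            value = compute.get();
            compute = null;      // drop the captured environment after forcing
        }
        return value;
    }
}

public class ThunkDemo {
    public static void main(String[] args) {
        int[] calls = {0};
        Thunk<Integer> t = new Thunk<>(() -> { calls[0]++; return 6 * 7; });
        // Forced once, then served from the cache.
        System.out.println(t.get() + " " + t.get() + " forced " + calls[0] + " time(s)");
    }
}
```

The lambda passed to the constructor is the generated closure; the Thunk wrapper only adds the "awaiting evaluation" part.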

continuation hostility

Thanks for the slides. Text is much better. I especially like this idea on slide 12: "Would rather not conflate implementation with binary representation." Reserving ability to change the implementation sounds good; best not to lock yourself into one technique by the way you invoke a feature.

Wikipedia defines thunk in several somewhat related ways, with the common theme of gluing things together with tool-generated code (control-flow shims). Typically these would be synchronous whenever the code reached this way is also synchronous: running to completion before replying to a waiting caller. For an anonymous lambda that can access locals in the method creating it, a closure would be needed so those variables stay bound the same way.
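That last point is visible directly in Java (a small sketch; names are mine): the lambda closes over an effectively final local, and the binding stays live after the creating method's frame is gone.

```java
import java.util.function.IntSupplier;

public class CaptureDemo {
    // 'sum' is an effectively final local; the lambda closes over it,
    // so the binding outlives the method activation that created it.
    static IntSupplier delayedSum(int a, int b) {
        int sum = a + b;
        return () -> sum;
    }

    public static void main(String[] args) {
        IntSupplier later = delayedSum(40, 2);
        // The frame for delayedSum has returned, but the captured value survives.
        System.out.println(later.getAsInt()); // prints 42
    }
}
```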

A continuation isn't needed, but closures and continuations have some overlap in the semantics of the environment context, which is usually a stack of some kind. (Not using stacks makes my brain hurt, so I'll pretend that can't happen.) C and Java typically have contiguous, monolithic stacks — one per thread — but other languages put individual activation frames on the heap in (usually acyclic) trees, so multiple overlapping lists of frames can be preserved by closures and/or continuations. Early Schemes can assume this organization, called spaghetti stacks or cactus stacks. Cactus stacks don't require copying to manage continuations, but contiguous/monolithic stacks do, and that's why they penalize continuation features.
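The frame organization is easy to sketch in Java terms (a toy model of a cactus stack, not how any real JVM lays out frames): each heap-allocated activation frame points at its caller, so two captured branches can share a common prefix with no copying.

```java
// Toy model: heap-allocated activation frames forming a cactus stack.
// Each frame points to its caller; branches can share a prefix.
final class Frame {
    final String name;
    final Frame caller;   // parent in the cactus stack (null at the root)

    Frame(String name, Frame caller) {
        this.name = name;
        this.caller = caller;
    }
}

public class CactusDemo {
    public static void main(String[] args) {
        Frame root = new Frame("main", null);
        Frame a = new Frame("a", root);   // two branches growing from 'root'
        Frame b = new Frame("b", root);
        // "Capturing a continuation" here is just holding a Frame reference;
        // no copying is needed, unlike with a contiguous stack.
        System.out.println(a.caller == b.caller); // prints true
    }
}
```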

Closures are usually just about data flow: keeping variable bindings alive and in use as long as a procedure (lambda evaluation instance) is still around using them. In Scheme, if you evaluate a lambda expression and keep the result as a first-class value in some variable that remains in scope, it will keep alive the variables it uses, and potentially other state captured at the same time, if the stack environment was grabbed as part of the closure. (The easy naive thing to do is save the stack's current list of activation frames in a closure.)

Continuations are about control flow too, not just the data. Calling a first class continuation is like assigning it as a fiber to the current thread, so current execution abdicates in favor of an old point in the execution call tree represented by a continuation. You can do coroutines this way, and no stack copying is required if stacks are heap-based lists of records. It's more like copying just a jmp_buf in C, or whatever represents the PC in a language's runtime. This is why in Scheme you can go anywhere — up, down, or sideways — when calling a continuation. This behavior is basically asynchronous in the sense a continuation resumes any time it's called, and not according to some orderly nesting we associate with a single stack.

You can think of a closure as part of the data-state of a continuation: a piece of the data skeleton, where the continuation itself is the entire living organism (that might be in a suspended state if not now running). I suspect Java only has one continuation per thread: the current thread's call tree. Edit: I guess you can consider each exception catch-point still present higher on the stack as a call-once continuation.
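That edit can be illustrated with a sketch (class and method names are mine): throwing to a catch-point higher on the stack behaves like invoking a one-shot escape continuation that abandons the frames below it.

```java
// Sketch: an exception used as a call-once "escape continuation".
final class Escape extends RuntimeException {
    final int value;

    Escape(int value) {
        // Control-flow exception: suppress the (costly, unused) stack trace.
        super(null, null, false, false);
        this.value = value;
    }
}

public class EscapeDemo {
    // Establishes the "catch-point continuation", then descends.
    static int search(int[] xs, int target) {
        try {
            walk(xs, target, 0);
            return -1;          // not found: normal return
        } catch (Escape e) {
            return e.value;     // the escape was "invoked" with the index
        }
    }

    static void walk(int[] xs, int target, int i) {
        if (i == xs.length) return;
        if (xs[i] == target) throw new Escape(i); // jump to the catch-point
        walk(xs, target, i + 1);
    }

    public static void main(String[] args) {
        System.out.println(EscapeDemo.search(new int[]{5, 7, 9}, 9)); // prints 2
    }
}
```

Like a continuation it transfers control non-locally, but it can only go up the current stack, and only once, which is why it is at best a call-once, upward-only special case.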