Closing the Stage: From Staged Code to Typed Closures

Yukiyoshi Kameyama, Oleg Kiselyov, Chung-chieh Shan. Closing the Stage: From Staged Code to Typed Closures. PEPM'08. (code)

Code generation lets us write well-abstracted programs without performance penalty. Writing a correct code generator is easier than building a full-scale compiler but still hard. Typed multistage languages such as MetaOCaml help in two ways: they provide simple annotations to express code generation, and they assure that the generated code is well-typed and well-scoped. Unfortunately, the assurance only holds without side effects such as state and control. Without effects, generators often have to be written in a continuation-passing or monadic style that has proved inconvenient. It is thus a pressing open problem to combine effects with staging in a sound type system.

This paper takes a first step towards solving the problem, by translating the staging away. Our source language models MetaOCaml restricted to one future stage. It is a call-by-value language, with a sound type system and a small-step operational semantics, that supports building open code, running closed code, cross-stage persistence, and non-termination effects. We translate each typing derivation from this source language to the unstaged System F with constants. Our translation represents future-stage code using closures, yet preserves the typing, alpha-equivalence (hygiene), and (we conjecture) termination and evaluation order of the staged program...

Essentially, representation of staged programs in an unstaged language turns out to be related to typechecking/typed compilation: typechecking is a stage. While this basic idea is straightforward, the work presented in the paper is rather intricate and depends on previous work on typed compilation. Oleg will be able to put this work in context in the forum, hopefully.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Typed compilation, proofs and staging

Recent discussions have brought together two seemingly unrelated areas: typed compilation and the representation of staging. By typed compilation I mean a transformation from an untyped to typed tagless representations. The untyped form is an AST (represented as a regular data type), which is usually the result of parsing a file or similar plain data. The typed tagless representation takes form of either generalized algebraic data types GADT (the initial approach), or alternatively, type-constructor--polymorphic terms (the final approach). Either type representation can be interpreted in various ways (e.g., evaluated, CPS-transformed, partially evaluated, etc). All these interpretations are assuredly type-preserving and patently free of any `type errors' such as failure to pattern-match type tags or reference to an unbound variable. The freedom from pattern-match failures is syntactically apparent: the representation just has no type tags at all, corresponding to closed code by construction. Therefore, the typed representations are handled internally as if no types existed: indeed, after typechecking all the types can be erased.

Typed compilation is a generalization of typechecking (the latter merely verifies if an expression is well-typed). Typed compilation is related to dynamic loading of (typed) code and is a part of implementing typed DSL. Typed compilation is inherently a partial operation (the source term may be ill-typed) and it has to deal with (run-time) type representations. The result of compilation, however, has no type-related partiality and needs no type information. `Typecheck once, assuredly interpret many times' is our main goal. It is achievable, albeit not simply. We follow the seminal work of Baars and Swierstra Typing Dynamic Typing. Given below are the applications of the ideas of that paper:

Implementing a typed DSL with the typed evaluator and the the typed compiler from untyped terms to GADT

Implementing a typed DSL with the typed compiled evaluator and the the typed compiler from untyped terms to GADT. We use template Haskell to `compile' GADT to `machine code'

Typed compilation from untyped terms to the Tagless Final representation (which can be evaluated by all the interpreters in the Tagless Final paper)

It is the latter typed compiler that shows uncanny similarities with the staging. In the hindsight, the connection is obvious: typechecking is a stage in the evaluation of a source code expression. The connection is deeper than the mere conceptual correspondence, and is closely related to a different paper on the representation of staging in an unstaged language (described above). To type-compile an untyped term to a higher-order abstract syntax, we need staging, or its emulation. Both type-checking and staging have to manipulate open code, whose free variables are hypotheses, or type abstractions. Another remarkable part of the tagless final compilation is its close relationship with proofs in logic. Typechecking is truly building a proof, using hypothetical reasoning. Moreover, our type compiler must be explicit with weakening (to extend the typing environment) and the application of structural rules (which take the form of cut-rules). The result of type compiling of the body of the function happens do be the same as that of compiling the whole function, which is the statement of the Deduction Theorem. The result of our typed compilation includes many administrative redices, which would be absent had we used genuine staging. One may consider genuine staging to be `cut elimination' with respect to structural rules.

This is a fascinating read,

This is a fascinating paper, and even after two weeks, and multiple reads, I'm still struggling with it. It's to be expected though, and I've already garnered some very useful insights.

I think the tagless approach is significant, and this paper is an important exploration in the same space, and my enthusiasm has absolutely nothing to do with the fact that Oleg actually cited my post in that thread. Nope, none whatsoever. ;-)

In fact, I just posted about my tagless embedding of the Orc language in C#. The tagless approach seems to has a close relationship with combinators, and I used that to good effect in my embedding of Orc' combinators.

Even more fun

If you have understood all of the above, then there are a few further obvious remarks still worth making: in a staged environment, most type-level trickery is completely unnecessary, since ordinary terms can be used as witnesses for properties. In effect, you get arbitrary computations at the type (or kind or ...) level ``for free''.

This does not quite mean that you get dependent types for free. This is also obvious: general dependent types must exist, somehow, at run-time [think of the case of Matrices, where the size usually needs a run-time representation]. What you do get for free are all properties which are statically decidable. The interesting question then becomes, exactly how much type erasure can you still do? [The answer is obvious, but uncomputable: anything knowable statically can be erased; the hard part is to figure out decision procedures that are powerful and useful].

Typed Compilation - IncopeTypecheck.hs

I'm working through the IncopeTypecheck.hs, but being an ML guy, I'm having a bit of a hard time with the Haskell and some of the terminology. For instance, "weakening" is used quite a bit, but I don't quite understand what's going on there, so I'm hoping someone will take pity on me. :-)

[...] the typechecking of (UInt 1) gives DynTerm (int 1), in the empty environment. Since the type-checking env includes "x" of the type int, we must weaken the result, giving us DynTerm (lam (\x -> int 1)). (UVar "x") in the env [("x",int)] is translated to DynTerm (lam (\x -> x)).

From this and other examples, I get the impression that expressions accessing environment values get lifted into a lambda and the environment value is passed in as an argument to the lambda.

Unfortunately, the type of these arguments and the relationship to the type in the environment isn't clearly discussed, so the subsequent example confuses me:

We now need to "add" the two results. First, we weaken the "add" function from the empty environment to the current environment, [("x",int)]. We obtain lam (\u -> lam (\v -> lam (\x -> add (u x) (v x)))). This is the _total_ term. To make the body of the function, we add the applications:

DynTerm (
          lam (\u -> lam (\v -> lam (\x -> add (u x) (v x))))
          `app` (lam (\x -> x))
          `app` (lam (\x -> int 1)))

This is the result of typechecking _the_ body of the function, (UAdd (UVar "x") (UInt 1)) in the environment [("x",int)].

What are the types of u, v and x? I'm guessing that u = \x -> x, and that v = \x -> int 1, and that x is some sort of environment value. Assuming the type I inferred for u is correct, then x is an int. Is that correct?

IncopeTypecheck.hs: building witnesses with weakening

Weakening refers to the principle (valid in classical and intuitionistic logics) that a derived conclusion continues to hold if more assumptions are added. Weakening is so `obvious' that it is normally elided in derivations. Our typechecker however constructs not only a type derivation but also a proof object; construction of the latter forces us to face weakening.

Let's take a simple example, of a term \x:Int. x + 1 in simply-typed lambda calculus with integer literals and addition. The type-checking derivation can be written as follows.

                    [] |- 1:Int     []  |- +: Int -> Int -> Int
                 --------------   -----------------------------
x:Int |- x:Int   x:Int |- 1:Int   x:Int |- +: Int -> Int -> Int
x:Int |- x + 1: Int
[] |- (\x:Int. x+1): Int
There are two weakening steps, from [] |- 1:Int to x:Int |- 1:Int and the same for plus. Let us now build proof-terms for the above derivation. We build proof terms by composing terms in a typed functional language (think ML). A proof term corresponds (or, describes, witnesses) a type judgment. A type judgment
  x:t1, y:t2, z:t3 |- e: t
is witnessed by a term of the type t1 -> t2 -> t3 -> t. The judgment [] |- 1:Int is witnessed by the proof term (int 1). The judgment [] |- +: Int -> Int -> Int is witnessed by the proof term (fun u -> fun v -> add u v) where add is a pre-defined constant of the appropriate type. The judgment x:Int |- x:Int is witnessed by a term fun (x:Int) -> x. It clearly has the required type. To proceed further, we need to build a witness for the judgment x:Int |- 1:Int. We build it by weakening the judgment [] |- 1:Int and its proof term int 1. The result: fun (x:Int) -> int 1.

Now, we need to build the witness for the typing of plus in the environment x:Int. The proof term (fun u -> fun v -> add u v) for plus in the empty environment says that given two closed expressions u and v of the type Int, we can build another expression of the type Int, their sum. We have to figure out how to add two open expressions u' and v', which may contain a free variable of the type Int. Each such expression has the type Int->Int. It is easy to see that the desired witnessed is as follows

	fun (x:Int) -> fun (u':Int->Int) -> fun (v':Int->Int) ->
	(fun u -> fun v -> add u v) (u' x) (v' x)
We are now in a position to build a witness for the judgment x:Int |- x + 1: Int. We combine the three witnesses we just derived, resulting in
fun (x:Int) -> 
    (fun (x:Int) -> fun (u':Int->Int) -> fun (v':Int->Int) ->
	(fun u -> fun v -> add u v) (u' x) (v' x)) 
    x (fun x -> x) (fun x -> int 1)
This turns out to be the witness for the final judgment [] |- (\x:Int. x+1): Int, and the result of the typed compilation.

Thanks for the great

Thanks for the great explanation Oleg. What concerns me is that every environment variable creates a new outer lambda, regardless of whether it's captured in an expression, ie. x:Int |- 1:Int ==> fun x -> int 1. For a trivial environment, this isn't a big issue, but environments can be fairly large in real programs, so this quickly becomes a large overhead for a real interpreter. Since I'm playing with a toy interpreter, I'm trying to properly understand how this is supposed to work in order to implement it.

The ideas I had to address this overhead:

1. Perhaps the environment can be minimized somehow.
2. Partial evaluation to compile away the unreferenced outer lambdas.
3. Compile the witnesses to a record calculus instead of the lambda calculus. The variable reference is then simply an offset into a record, whose overhead is thus acceptable.
4. Instead of nested curried lambdas, build a single tupled outer lambda, which yields a more efficient value access in the spirit of the option #3. Some way to minimize the environment would also be useful, as the tuple would then simply correspond to the function stack frame.

Any suggestions would be much appreciated, particularly if I'm on completely the wrong track.

Cut-elimination, typed compilation via record calculus

I think I'm starting to understand better how this all works, and how it relates to cut-elimination.

To expand on the idea of compilation to a record calculus, assume a record selection operator ".", and a record extension operator "|". Let "r" stand for row variables, and "_" for the empty record. Record extension then replaces lambda abstraction (for weakening), and record selection replaces application, in the typed compilation derivation Oleg described.

For the empty environment, the derivation for a record calculus then becomes:

[] |- 1:Int
env:{r} -> (int 1)

[] |- +:Int -> Int -> Int
env:{r | u:Int | v:Int} -> add env.u env.v

With an environment x:Int:

x:Int |- x:Int
env:{r | x:Int} -> env.x

x:Int |- 1:Int
env:{r | x:Int} -> int 1

x:Int |- +:Int -> Int -> Int
env:{r | x:Int | u:Int | v:Int} -> add env.u env.v

And the final derivation:

x:Int |- x + 1:Int
env:{r | x:Int} ->  (env:{r | x:Int | u:Int | v:Int} -> add env.u env.v) (env | env.x | int 1)

I'd appreciate any corrections to my derivation. :-)

We must pass offsets at runtime for constant time field access. Of course, the CPU already carries this offset for us: it's the stack register, and the environment is the stack. Record extension is thus pushing a stack frame, and record selection is accessing a stack variable. This would seem to support an efficient representation of typed compilation, without the overhead of all of these administrative redices.


x:Int |- +:Int -> Int -> Int
env:{r | x:Int | u:Int | v:Int} -> add env.u env.v

Can you explain what's going on? If u and v aren't functions of x then what's the point of all of this?

Also you're using 'env' twice in the last derivation.

Can you explain what's going

Can you explain what's going on? If u and v aren't functions of x then what's the point of all of this?

Oops, you're right. The correct type should be:

x:Int |- +:Int -> Int -> Int
env:{r | x:Int | u':{r | x:Int}->Int | v':{r | x:Int}->Int} ->
  (env:{r | u:Int | v:Int} -> add env.u env.v) (env.u' env) (env.v' env)

x:Int |- x + 1:Int
env:{r | x:Int} ->
  (env:{r | x:Int | u':{r | x:Int}->Int | v':{r | x:Int}->Int} -> add (env.u' env) (env.v' env))
  (env | fun env -> env.x | fun env -> int 1)

I completely forgot the purpose of the "open expressions" u' and v'. So this technique doesn't really save on administrative redices, it's just more explicit with stack handling as an extensible record.