archives

Typing "let rec"

I was putting together a small typechecker with a friend, who argued that the most straightforward way to determine the types of members of a set of mutually recursive functions would be to initially set the types of all to _|_, then up to a fixed point update the var/type environment with what's known about the return type of each function. After that process, if any type association contains _|_ then the type check should fail.


That definition would have rejected something like let rec f x = f x, which in O'Caml has type forall A . forall B . A -> B.


So my question is, are there important functions that my friend's definition leaves out? I know that "Types and Programming Languages" would have us apply fix at the term level, but where can we go with this approach?

Mechanizing the Metatheory of LF

Christian Urban, James Cheney, Stefan Berghofer. Mechanizing the Metatheory of LF. Extended tech report accompanying LICS 2008 paper.

LF is a dependent type theory in which many other formal systems can be conveniently embedded. However, correct use of LF relies on nontrivial metatheoretic developments such as proofs of correctness of decision procedures for LF's judgments. Although detailed informal proofs of these properties have been published, they have not been formally verified in a theorem prover. We have formalized these properties within Isabelle/HOL using the Nominal Datatype Package, closely following a recent article by Harper and Pfenning. In the process, we identified and resolved a gap in one of the proofs and a small number of minor lacunae in others. Besides its intrinsic interest, our formalization provides a foundation for studying the adequacy of LF encodings, the correctness of Twelf-style metatheoretic reasoning, and the metatheory of extensions to LF.

This paper is a substantial exercise in applying the approach of Harper and Licata in Mechanizing Language Definitions reported before, and is an advance towards a worthy objective in the campaign to mechanise metatheory that Paul espouses...

Register Allocation by Proof Transformation

Register Allocation by Proof Transformation, Atsushi Ohori. ESOP 2003.

This paper presents a proof-theoretical framework for register allocation that accounts for the entire process of register allocation. Liveness analysis is proof reconstruction (similar to type inference), and register allocation is proof transformation from a proof system with unrestricted variable accesses to the one with restricted variable access. In our framework, the set of registers acts as the ``working set'' of the live variables at each instruction step, which changes during the execution of the code. This eliminates the ad-hoc notion of ``spilling''. The necessary memory-register moves are systematically incorporated in the proof transformation process. Its correctness is a simple corollary of our construction; the resulting proof is equivalent to the proof of the original code modulo the treatment of structural rules. This yields a simple yet powerful register allocation algorithm. The algorithm has been implemented, demonstrating the feasibility of the framework.

The idea that making uses of the structural rules explicit gives you proof terms to model register-memory moves is very pretty. Another fun thing to do would be to take a continuation calculus and apply the ideas here to see if it produces a good low-level IR.

EDIT: Ehud asked for some further exposition, so here goes.

At a high level, you can think of the need to do register allocation as arising from a mismatch between a programming language and the hardware. In a language, we refer to data using variables, and in any given expression, we can use as many variables as we like. However, when a CPU does stuff, it wants the data to be in registers -- and it has only a small, finite set of them. So when a program is compiled, some variables can be represented by registers, and the rest must be represented by locations in memory (usually on the stack). Whenever the CPU needs to use a variable in memory, there must be explicit code to move it from memory into a register.

The idea in this paper is to take the typing derivation of a program with an unbounded variable set, and then divide the context into two parts, one representing the register file and the other representing variables in memory. In terms of the implementation, moves between these two zones correspond to register-memory moves; and in terms of logic, this is a use of the structural rule of Exchange, which permutes the order of variables in a context.

So this gives us a high-level, machine-independent characterization of the register allocation problem: take a one-zone derivation and convert it to a two-zone derivation. But it gets even better: as long as the register allocation algorithm only adds uses of the structural rules in its transformation, we know that the meaning of the original program is unchanged -- so this method also yields a simple way of proving that a register allocation pass is semantics-preserving. (The fact that this is an easy proof is one indication of the power of this idea.)