Type Theory

Register Allocation by Proof Transformation

Register Allocation by Proof Transformation, Atsushi Ohori. ESOP 2003.

This paper presents a proof-theoretical framework for register allocation that accounts for the entire process of register allocation. Liveness analysis is proof reconstruction (similar to type inference), and register allocation is proof transformation from a proof system with unrestricted variable accesses to one with restricted variable accesses. In our framework, the set of registers acts as the "working set" of the live variables at each instruction step, which changes during the execution of the code. This eliminates the ad-hoc notion of "spilling". The necessary memory-register moves are systematically incorporated in the proof transformation process. Its correctness is a simple corollary of our construction; the resulting proof is equivalent to the proof of the original code modulo the treatment of structural rules. This yields a simple yet powerful register allocation algorithm. The algorithm has been implemented, demonstrating the feasibility of the framework.

The idea that making uses of the structural rules explicit gives you proof terms that model register-memory moves is very pretty. Another fun exercise would be to take a continuation calculus and apply the ideas here, to see whether it yields a good low-level IR.

EDIT: Ehud asked for some further exposition, so here goes.

At a high level, you can think of the need to do register allocation as arising from a mismatch between a programming language and the hardware. In a language, we refer to data using variables, and in any given expression, we can use as many variables as we like. However, when a CPU does stuff, it wants the data to be in registers -- and it has only a small, finite set of them. So when a program is compiled, some variables can be represented by registers, and the rest must be represented by locations in memory (usually on the stack). Whenever the CPU needs to use a variable in memory, there must be explicit code to move it from memory into a register.

The idea in this paper is to take the typing derivation of a program with an unbounded variable set, and then divide the context into two parts, one representing the register file and the other representing variables in memory. In terms of the implementation, moves between these two zones correspond to register-memory moves; and in terms of logic, this is a use of the structural rule of Exchange, which permutes the order of variables in a context.

So this gives us a high-level, machine-independent characterization of the register allocation problem: take a one-zone derivation and convert it to a two-zone derivation. But it gets even better: as long as the register allocation algorithm only adds uses of the structural rules in its transformation, we know that the meaning of the original program is unchanged -- so this method also yields a simple way of proving that a register allocation pass is semantics-preserving. (The fact that this is an easy proof is one indication of the power of this idea.)
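
To make the two-zone picture concrete, here is a toy OCaml sketch of how I read the idea -- the types and names below are mine, not the paper's calculus. The context of live variables is split into a bounded register zone and a memory zone, and the only operation that crosses the boundary is an explicit move, which is exactly where a load or store gets emitted:

  type var = string

  (* The context of live variables, split into the two zones. *)
  type ctx = {
    regs : var list;   (* the bounded "working set" held in registers *)
    mem  : var list;   (* everything else lives in memory (the stack) *)
  }

  let max_regs = 4

  (* Bring [x] into the register zone before an instruction uses it,
     spilling a resident variable if the zone is full.  The returned
     strings stand for the emitted memory-register moves. *)
  let promote (x : var) (c : ctx) : ctx * string list =
    if List.mem x c.regs then (c, [])
    else
      let spill, kept, stores =
        match c.regs with
        | victim :: rest when List.length c.regs >= max_regs ->
            ([victim], rest, [Printf.sprintf "store %s" victim])
        | regs -> ([], regs, [])
      in
      ( { regs = x :: kept;
          mem  = spill @ List.filter (fun v -> v <> x) c.mem },
        stores @ [Printf.sprintf "load %s" x] )

In the paper the analogous step is a proof transformation that inserts uses of the structural rules, so the load/store code falls out of the derivation instead of being bolted on afterwards.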

Mechanizing the Metatheory of LF

Christian Urban, James Cheney, Stefan Berghofer. Mechanizing the Metatheory of LF. Extended tech report accompanying their LICS 2008 paper.

LF is a dependent type theory in which many other formal systems can be conveniently embedded. However, correct use of LF relies on nontrivial metatheoretic developments such as proofs of correctness of decision procedures for LF's judgments. Although detailed informal proofs of these properties have been published, they have not been formally verified in a theorem prover. We have formalized these properties within Isabelle/HOL using the Nominal Datatype Package, closely following a recent article by Harper and Pfenning. In the process, we identified and resolved a gap in one of the proofs and a small number of minor lacunae in others. Besides its intrinsic interest, our formalization provides a foundation for studying the adequacy of LF encodings, the correctness of Twelf-style metatheoretic reasoning, and the metatheory of extensions to LF.

This paper is a substantial exercise in applying the approach of Harper and Licata's Mechanizing Language Definitions (reported here before), and an advance towards a worthy objective in the campaign to mechanise metatheory that Paul espouses...

The Disciplined Disciple Compiler

Disciple is an explicitly lazy dialect of Haskell which includes:
  • first-class destructive update of arbitrary data.
  • computational effects without the need for state monads.
  • type directed field projections.
All this and more through the magic of effect typing.

The wiki page has more information; unfortunately, there's no paper yet summarizing the ideas and results. Their effect system is quite interesting. Some of the ideas recently discussed here are implemented in Disciple.

via Haskell Cafe

Uniqueness Typing Simplified

Uniqueness Typing Simplified, by Edsko de Vries, Rinus Plasmeijer, and David M. Abrahamson.

We present a uniqueness type system that is simpler than both Clean’s uniqueness system and a system we proposed previously. The new type system is straightforward to implement and add to existing compilers, and can easily be extended with advanced features such as higher rank types and impredicativity. We describe our implementation in Morrow, an experimental functional language with both these features. Finally, we prove soundness of the core type system with respect to the call-by-need lambda calculus.

Uniqueness typing is related to linear typing, and their differences have been discussed here before. Linear types have many applications. This paper describes the difference between linear and unique types:

In linear logic, variables of a non-linear type can be coerced to a linear type (dereliction). Harrington phrases it well: in linear logic, "linear" means "will not be duplicated" whereas in uniqueness typing, "unique" means "has not been duplicated".

In contrast to other papers on substructural typing, such as Fluet's thesis Monadic and Substructural Type Systems for Region-Based Memory Management, this paper classifies uniqueness attributes by a kind system. This possibility was mentioned in Fluet's thesis as well, Section 4.2, footnote 8, though the technique used here seems somewhat different.

Uniqueness typing is generally used to tame side-effects in purely functional languages. Uniqueness types have also been contrasted with monads on LTU, which are also used to tame effects, among other things. One point not discussed in that thread is how straightforward it is to compile each approach into efficient code. It seems clear that uniqueness types admit a straightforward, efficient compilation, but it's not clear to me how efficient monads can be, or how much work is required to make them efficient. This may make uniqueness or substructural types more suitable for just-in-time compilers than monads.
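
To make the compilation point concrete, here is a minimal OCaml sketch of the intuition. OCaml has no uniqueness typing, so the guarantee only lives in a comment; in Clean, or in the system of this paper, it would be checked by the type system:

  (* If the type system guarantees that [arr] is unique -- no other live
     reference to it exists -- the "pure" update below may be compiled as
     a destructive, in-place write. *)
  let set_unique (arr : 'a array) (i : int) (x : 'a) : 'a array =
    arr.(i) <- x;       (* safe only because arr is (assumed) unique *)
    arr

  (* Without the uniqueness guarantee, a pure update has to copy. *)
  let set_shared (arr : 'a array) (i : int) (x : 'a) : 'a array =
    let arr' = Array.copy arr in
    arr'.(i) <- x;
    arr'

A monadic interface can reach the same destructive implementation, but only after the compiler (or a clever library) has optimized away the monadic plumbing, which is the part I'm less sure about.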

History of Lambda-Calculus and Combinatory Logic

F. Cardone and J. R. Hindley. History of Lambda-Calculus and Combinatory Logic. To appear as a chapter in Volume 5 of the Handbook of the History of Logic.

From the introduction:

Seen in outline, the history of LC and CL splits into three main periods: first, several years of intensive and very fruitful study in the 1920s and ’30s; next, a middle period of nearly 30 years of relative quiet; then in the late 1960s an upsurge of activity stimulated by developments in higher-order function theory, by connections with programming languages, and by new technical discoveries. The fruits of the first period included the first-ever proof that predicate logic is undecidable. The results of the second attracted very little non-specialist interest, but included completeness, cut-elimination and standardization theorems (for example) that found many uses later. The achievements of the third, from the 1960s onward, included constructions and analyses of models, development of polymorphic type systems, deep analyses of the reduction process, and many others probably well known to the reader. The high level of activity of this period continues today.

Beware: This is a long paper (though shorter than the page count suggests: about half the pages are dedicated to the bibliography).

In the announcement on the TYPES Forum the authors invited comments, suggestions, and additional information on the topics of the paper, namely the development of lambda-calculi and combinatory logic from their prehistory (Frege, Peano, and Russell) to the end of the 20th century.

The YNot Project

The YNot Project

The goal of the Ynot project is to make programming with dependent types practical for a modern programming language. In particular, we are extending the Coq proof assistant to make it possible to write higher-order, imperative and concurrent programs (in the style of Haskell) through a shallow embedding of Hoare Type Theory (HTT). HTT provides a clean separation between pure and effectful computations, makes it possible to formally specify and reason about effects, and is fully compositional.

This was actually commented on here by Greg Morrisett himself. Conceptually, this seems related to Adam Chlipala's A Certified Type-Preserving Compiler from Lambda Calculus to Assembly Language. See, in particular, slides 23-24 of this presentation (PDF). More generally, computation and reflection seem to be gaining recognition as important features for the practical use of such powerful tools as Coq; see also SSREFLECT tactics for Coq and their accompanying paper A Small Scale Reflection Extension for the Coq system (PDF).

It's also interesting to observe that the work appears to depend on the "Program" keyword in the forthcoming Coq 8.2; the work behind it is known as "Russell," as referred to in this thread. In the meantime, Russell's main page is here. Again, the point is to simplify programming with dependent types in Coq.

Update: Some preliminary code is available from the project page.

The Design and Implementation of Typed Scheme

Tobin-Hochstadt, Felleisen. The Design and Implementation of Typed Scheme. POPL 2008.

When scripts in untyped languages grow into large programs, maintaining them becomes difficult. A lack of types in typical scripting languages means that programmers must (re)discover critical pieces of design information every time they wish to change a program. This analysis step both slows down the maintenance process and may even introduce mistakes due to the violation of undiscovered invariants. This paper presents Typed Scheme, an explicitly typed extension of an untyped scripting language. Its type system is based on the novel notion of occurrence typing, which we formalize and mechanically prove sound. The implementation of Typed Scheme additionally borrows elements from a range of approaches, including recursive types, true unions and subtyping, plus polymorphism combined with a modicum of local inference. Initial experiments with the implementation suggest that Typed Scheme naturally accommodates the programming style of the underlying scripting language, at least for the first few thousand lines of ported code.

The key feature of occurrence typing is the ability of the type system to assign distinct types to distinct occurrences of a variable based on control flow criteria. You can think of it as a generalization of the way pattern matching on algebraic datatypes works.
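
For readers more comfortable with ML than Scheme, here is a rough OCaml analogy (my own, not from the paper): matching on a constructor refines what is known about the value in each branch, and occurrence typing performs the analogous refinement when an untyped or union-typed variable is tested with a predicate such as number?.

  (* ML-style refinement by pattern matching. *)
  type scheme_value = Num of int | Str of string

  let len_or_succ (v : scheme_value) : int =
    match v with
    | Num n -> n + 1              (* here v is known to hold an int    *)
    | Str s -> String.length s    (* here v is known to hold a string  *)

In Typed Scheme there is no constructor to match on: roughly, the same function takes a value of union type (U Number String), and the two branches of (if (number? x) ... ...) see x at types Number and String respectively. That control-flow-driven refinement is what the paper formalizes and proves sound.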

Go to sections 4.4 and 7 to whet your appetite...

J&: Nested Intersection for Scalable Software Composition

J&: Nested Intersection for Scalable Software Composition by Nathaniel Nystrom, Xin Qi, Andrew C. Myers. 2006.

We identify the following requirements for general extension and composition of software systems:
  1. Orthogonal extension: Extensions may require both new data types and new operations.
  2. Type safety: Extensions cannot create run-time type errors.
  3. Modularity: The base system can be extended without modifying or recompiling its code.
  4. Scalability: Extensions should be scalable. The amount of code needed should be proportional to the functionality added.
  5. Non-destructive extension: The base system should still be available for use within the extended system.
  6. Composability of extensions.
The first three of these requirements correspond to Wadler’s expression problem. Scalability (4) is often but not necessarily satisfied by supporting separate compilation; it is important for extending large software. Non-destructive extension (5) enables existing clients of the base system and also the extended system itself to interoperate with code and data of the base system, an important requirement for backward compatibility. Nested inheritance addresses the first five requirements, but it does not support extension composition. Nested intersection adds this capability.

Compare this approach to the one taken by Scala (or read section 7 of the paper).
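
To see what requirements 1-3 (the expression problem) amount to in code, here is a small OCaml sketch using polymorphic variants -- just an illustration of orthogonal, type-safe, modular extension, not of J&'s nested intersection mechanism:

  (* Base system: constants and addition, with an open-recursive evaluator. *)
  type 'e base = [ `Num of int | `Add of 'e * 'e ]

  let eval_base (eval : 'e -> int) : 'e base -> int = function
    | `Num n -> n
    | `Add (a, b) -> eval a + eval b

  (* Extension: a new constructor and a new evaluator case, reusing the
     base evaluator without modifying or recompiling it. *)
  type 'e ext = [ 'e base | `Neg of 'e ]

  let eval_ext (eval : 'e -> int) : 'e ext -> int = function
    | #base as b -> eval_base eval b
    | `Neg a -> - (eval a)

  (* Tie the knot for the extended language. *)
  type expr = [ `Num of int | `Add of expr * expr | `Neg of expr ]

  let rec eval (e : expr) : int = eval_ext eval e

  let _ = eval (`Add (`Num 1, `Neg (`Num 2)))   (* evaluates to -1 *)

Polymorphic variants get OCaml this far for flat datatypes; the point of nested intersection is to make this kind of composition scale to extensions of whole mutually recursive class families.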

Closing the Stage: From Staged Code to Typed Closures

Yukiyoshi Kameyama, Oleg Kiselyov, Chung-chieh Shan. Closing the Stage: From Staged Code to Typed Closures. PEPM'08. (code)

Code generation lets us write well-abstracted programs without performance penalty. Writing a correct code generator is easier than building a full-scale compiler but still hard. Typed multistage languages such as MetaOCaml help in two ways: they provide simple annotations to express code generation, and they assure that the generated code is well-typed and well-scoped. Unfortunately, the assurance only holds without side effects such as state and control. Without effects, generators often have to be written in a continuation-passing or monadic style that has proved inconvenient. It is thus a pressing open problem to combine effects with staging in a sound type system.

This paper takes a first step towards solving the problem, by translating the staging away. Our source language models MetaOCaml restricted to one future stage. It is a call-by-value language, with a sound type system and a small-step operational semantics, that supports building open code, running closed code, cross-stage persistence, and non-termination effects. We translate each typing derivation from this source language to the unstaged System F with constants. Our translation represents future-stage code using closures, yet preserves the typing, alpha-equivalence (hygiene), and (we conjecture) termination and evaluation order of the staged program...

Essentially, the representation of staged programs in an unstaged language turns out to be related to typechecking/typed compilation: typechecking is a stage. While this basic idea is straightforward, the work presented in the paper is rather intricate and depends on previous work on typed compilation. Hopefully, Oleg will be able to put this work in context in the forum.
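
Here is a toy OCaml version of the underlying trick, with the staging erased by hand (my own illustration; the paper's actual target is System F with constants, and the hard parts -- open code, cross-stage persistence, effects -- are exactly what it does not gloss over). A future-stage expression of type 'a becomes an ordinary closure from its environment to 'a:

  (* Future-stage code of type 'a over environment 'env, represented as
     an unstaged closure. *)
  type ('env, 'a) code = 'env -> 'a

  let lift (n : int) : ('env, int) code = fun _ -> n

  let add (x : ('env, int) code) (y : ('env, int) code) : ('env, int) code =
    fun env -> x env + y env

  (* "Running" closed code means applying it to the empty environment,
     the analogue of MetaOCaml's run on code with no free variables. *)
  let run (c : (unit, 'a) code) : 'a = c ()

  let three = run (add (lift 1) (lift 2))   (* = 3 *)

Generators that build open code (code with free future-stage variables) are where it gets interesting: the environment type has to grow and shrink as binders are entered and left, which is what the typed translation in the paper takes care of.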

OCaml Light: A Formal Semantics For a Substantial Subset of the Objective Caml Language

OCaml Light: a formal semantics for a substantial subset of the Objective Caml language.

OCaml light is a formal semantics for a substantial subset of the Objective Caml language. It is written in Ott, and it comprises a small-step operational semantics and a syntactic, non-algorithmic type system. A type soundness theorem has been proved and mechanized using the HOL-4 proof assistant, thereby ensuring that the proof is free from errors. To ensure that the operational semantics accurately models Objective Caml, an executable version of the semantics has been created (and proved equivalent in HOL to the original, relational version) and tested on a number of small test cases.

Note that while we have tried to make the semantics accurate, we are not part of the OCaml development team - this is in no sense a normative specification of the implemented language.

From a team including Peter Sewell (Acute, HashCaml, Ott).

I continue to believe that things are heating up nicely in mechanized metatheory, which, in the multicore/multiprocessor world in which we now live, is extremely good news.
