A Verified Compiler for an Impure Functional Language
We present a verified compiler to an idealized assembly language from a small, untyped functional language with mutable references and exceptions. The compiler is programmed in the Coq proof assistant and has a proof of total correctness with respect to big-step operational semantics for the source and target languages. Compilation is staged and includes standard phases like translation to continuation-passing style and closure conversion, as well as a common subexpression elimination optimization. In this work, our focus has been on discovering and using techniques that make our proofs easy to engineer and maintain. While most programming language work with proof assistants uses very manual proof styles, all of our proofs are implemented as adaptive programs in Coq’s tactic language, making it possible to reuse proofs unchanged as new language features are added.

In this paper, we focus especially on phases of compilation that rearrange the structure of syntax with nested variable binders. That aspect has been a key challenge area in past compiler verification projects, with much more effort expended in the statement and proof of binder-related lemmas than is found in standard pencil-and-paper proofs. We show how to exploit the representation technique of parametric higher-order abstract syntax to avoid the need to prove any of the usual lemmas about binder manipulation, often leading to proofs that are actually shorter than their pencil-and-paper analogues. Our strategy is based on a new approach to encoding operational semantics which delegates all concerns about substitution to the meta language, without using features incompatible with general-purpose type theories like Coq’s logic.
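To give a flavor of the representation the abstract describes, here is a minimal sketch of parametric higher-order abstract syntax in TypeScript (illustrative only; the paper's actual Coq definitions differ, and the names `Term`, `evalTerm`, and `countLams` are invented for this example). The variable type `V` is kept abstract, so terms cannot inspect their variables; binders are host-language functions, and substitution is delegated entirely to the meta language:

```typescript
// Terms of a tiny untyped lambda calculus, parametric in the variable type V.
type Term<V> =
  | { tag: "var"; v: V }
  | { tag: "app"; fn: Term<V>; arg: Term<V> }
  | { tag: "lam"; body: (x: V) => Term<V> };

// Object-language values: functions from values to values.
type Value = (x: Value) => Value;

// Evaluator: instantiate V with Value. Entering a binder is just a
// host-language function call -- no substitution function is defined.
function evalTerm(t: Term<Value>): Value {
  switch (t.tag) {
    case "var": return t.v;
    case "lam": return (x: Value) => evalTerm(t.body(x));
    case "app": return evalTerm(t.fn)(evalTerm(t.arg));
  }
}

// A syntax-level analysis: instantiate V with a dummy type. No variable
// names, de Bruijn indices, or binder-manipulation lemmas are needed.
function countLams(t: Term<null>): number {
  switch (t.tag) {
    case "var": return 0;
    case "lam": return 1 + countLams(t.body(null));
    case "app": return countLams(t.fn) + countLams(t.arg);
  }
}

// (fun x => x) applied to (fun y => y)
const id = <V>(): Term<V> => ({ tag: "lam", body: (x) => ({ tag: "var", v: x }) });
const prog = <V>(): Term<V> => ({ tag: "app", fn: id<V>(), arg: id<V>() });

console.log(countLams(prog<null>())); // 2
```

The point of the parametricity is that one syntax definition supports both evaluation and pure syntactic analyses, simply by choosing different instantiations of `V`.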

The latest from Adam Chlipala. Yet another evolutionary step for Lambda Tamer. Between this and Ynot the Coq/certified compiler story seems to be getting more impressive nearly daily.

Practical implications

Forgive my density, but I'm somewhat at a loss as to how useful and practical these achievements are. What can we do with a verified compiler that we couldn't do before?

It's not as if programs are deterministic on modern hardware, where multiple threads of control share memory (i.e. the real world). Performance concerns make non-determinism valuable, and optimizations will change timings, possibly breaking working programs even when those optimizations don't affect the semantics of straight-line single-threaded code.

And on the other side of the equation: given a business problem that needs solving, there doesn't seem to be much value in having a proof of the transformation of program text T to executable code C. Nobody cares about that. The people who pay the programmers care that the code appears to do what it's supposed to do, and mathematical correctness would be only a small part of that even if the acceptance criteria were specified in that much detail, which they never are, since such detail would normally require more effort than creating the program in the first place.

Practical uses

This compiler could be used to implement the life support system for children on the next space shuttle.

In the intro section to his

In the intro section to his paper about Compcert C, Xavier Leroy states that it is common practice in safety-critical programming to compile with no optimizations and then go through the assembly output line by line to verify that it corresponds to the source code. In such a setting, having a trustworthy compiler would be a productivity gain.

More generally, I find this line of research exciting because compilers are big nontrivial programs: the fact that we are now able to prove their correctness shows that program verification has come a long way.


There is a sharp difference between the class of programs that are acceptable if they behave incorrectly for only a small percentage of users and programs that are unacceptable if they behave incorrectly once.

That said, multiple cores are not a get-out-of-jail-free card for bugs.

There are good reasons to employ non-deterministic programs if they produce results (in time or space constraints) that a deterministic program could not. But that's the field of probabilistic algorithms, not the practice of fumbling around with threads until a program seems to work.

Software that deadlocks or sometimes adds to an account and sometimes subtracts from it doesn't solve business problems, it creates them.

Depends on the domain

As several other people have already pointed out, the value of this particular compiler depends on the domain in which you're developing. You can't assume that the requirements and constraints that apply to the domain you work in are equally applicable to every other domain. Or, in other words, not all programming is business software development.

I would only add...

...that a lot of businesses would also like to know that their compiler won't introduce classes of errors that could end up costing them millions of dollars, whether directly or indirectly.

But what about continuations?

Adam's paper uses big-step semantics, which makes it hard to extend to languages with continuations. His previous paper used a denotational style, which didn't extend to languages with general recursion. I predict that his next paper will either use abstract machines or small-step semantics, either of which can handle just about any language.

BTW, I really like the approach to HOAS proposed in the paper. Add a little indirection and voila!