Implementation

EASTL -- Electronic Arts Standard Template Library

The gaming studio Electronic Arts maintains its own version of the Standard Template Library. Although this is old news, I checked the LtU Archives and the new site, and there is no mention of EASTL anywhere. There are quite a few good blog posts about EASTL on the Internet, as well as the following paper, EASTL -- Electronic Arts Standard Template Library by Paul Pedriana:

Gaming platforms and game designs place requirements on game software which differ from requirements of other platforms. Most significantly, game software requires large amounts of memory but has a limited amount to work with. Gaming software is also faced with other limitations such as weaker processor caches, weaker CPUs, and non-default memory alignment requirements. A result of this is that game software needs to be careful with its use of memory and the CPU. The C++ standard library's containers, iterators, and algorithms are potentially useful for a variety of game programming needs. However, weaknesses and omissions of the standard library prevent it from being ideal for high performance game software. Foremost among these weaknesses is the allocator model. An extended and partially redesigned replacement (EASTL) for the C++ standard library was implemented at Electronic Arts in order to resolve these weaknesses in a portable and consistent way. This paper describes game software development issues, perceived weaknesses of the current C++ standard, and the design of EASTL as a partial solution for these weaknesses.

This paper is a good introduction to a unique set of requirements video game development studios face, and it complements Manuel Simoni's recent story about The AI Systems of Left 4 Dead. It could be a useful inroad for those seeking to apply newer object-functional programming languages and ideas to game development.

Why Object-Oriented Languages Need Tail Calls

The Fortress blog has a recent post, Why Object-Oriented Languages Need Tail Calls, in which Guy Steele argues for the necessity of proper tail call implementations without rehashing two of the classic arguments: state machines and continuation-passing style. It starts by mentioning William Cook's On Understanding Data Abstraction, Revisited:

In this blog post we extend one of his examples in order to make a completely different point: object-oriented programming languages need tail calls correctly implemented, not just as a "trivial and optional" optimization of the use of stack space that could be achieved by using iteration statements, but in order to preserve object-oriented abstractions.
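
As a rough illustration of the point, here is a Haskell sketch (names invented here, not taken from the blog post) of Cook-style sets as records of methods. The membership test of an adjoined set delegates to the underlying set in tail position, so querying a set built from many adjoins runs in constant stack space only if tail calls are implemented properly.

    -- Objects in the Cook style: a set is just a record of methods.
    newtype IntSet = IntSet { contains :: Int -> Bool }

    emptySet :: IntSet
    emptySet = IntSet (const False)

    -- adjoin delegates to the wrapped set in tail position; without
    -- proper tail calls, a membership test on a deeply nested set
    -- needs stack space proportional to its construction history.
    adjoin :: IntSet -> Int -> IntSet
    adjoin s x = IntSet (\y -> y == x || contains s y)

    -- A set built by a long chain of delegations.
    bigSet :: IntSet
    bigSet = foldl adjoin emptySet [1 .. 100000]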

The post also mentions other papers previously discussed on LtU: Automata as Macros, and A Tail-Recursive Machine with Stack Inspection.

PICBIT: A Scheme System for the PIC Microcontroller

Marc Feeley and Danny Dubé, PICBIT: A Scheme System for the PIC Microcontroller, Fourth Workshop on Scheme and Functional Programming. November 7, 2003.

This paper explains the design of the PICBIT R4RS Scheme system which specifically targets the PIC microcontroller family. The PIC is a popular inexpensive single-chip microcontroller for very compact embedded systems that has a ROM on the chip and a very small RAM. The main challenge is fitting the Scheme heap in only 2 kilobytes of RAM while still allowing useful applications to be run. PICBIT uses a novel compact (24 bit) object representation suited for such an environment and an optimizing compiler and byte-code interpreter that uses RAM frugally. Some experimental measurements are provided to assess the performance of the system.
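
To make "compact object representation" concrete, here is a minimal Haskell sketch of the general idea, with an invented layout (a 2-bit tag and a 22-bit payload packed into 24 bits); PICBIT's actual encoding is described in the paper and differs in its details.

    import Data.Bits ((.&.), (.|.), shiftL, shiftR)
    import Data.Word (Word32)

    -- A hypothetical 24-bit cell; only the low 24 bits are used.
    type Cell = Word32

    pack :: Word32 -> Word32 -> Cell
    pack tag payload = (tag `shiftL` 22) .|. (payload .&. 0x3FFFFF)

    tagOf :: Cell -> Word32
    tagOf c = (c `shiftR` 22) .&. 0x3

    payloadOf :: Cell -> Word32
    payloadOf c = c .&. 0x3FFFFF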

A very interesting perspective on language implementation, found via @dhess and previous discussion.

State of the art C compiler optimization tricks

A survey about state of the art C compiler optimization tricks, Felix von Leitner, Linux Kongress 2009.

The introduction and the conclusion are quite well put:

  • Optimizing == important.
  • But often: Readable code == more important
  • Learn what your compiler does

    Then let the compiler do it.

  • If you do an optimization, test it on real world data.
  • If it’s not drastically faster but makes the code less readable: undo it.

That's certainly something that I agree with 110%. And really, that's why a good compilers course is so important, even if the vast majority of students never write a compiler outside of class.

Safe Garbage Collection = Regions + Intensional Type Analysis

Safe Garbage Collection = Regions + Intensional Type Analysis, by Daniel C. Wang and Andrew W. Appel:

We present a technique to implement type-safe garbage collectors by combining existing type systems used for compiling type-safe languages. We adapt the type systems used in region inference [16] and intensional type analysis [8] to construct a safe stop-and-copy garbage collector for higher-order polymorphic languages. Rather than using region inference as the primary method of storage management, we show how it can be used to implement a garbage collector which is provably safe. We also introduce a new region calculus with non-nested object lifetimes which is significantly simpler than previous calculi. Our approach also formalizes more of the interface between garbage collectors and code generators. The efficiency of our safe collectors is algorithmically competitive with unsafe collectors.
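
The intensional-type-analysis half of the recipe can be illustrated in miniature. Here is a Haskell sketch (invented names, and only the typecase idea, not the paper's region-based collector): a runtime representation of types drives a generic copy function, much as a type-passing copying collector decides how to traverse an object by analyzing its type.

    {-# LANGUAGE GADTs #-}

    -- A runtime representation of types ("typecase").
    data Rep a where
      RInt  :: Rep Int
      RPair :: Rep a -> Rep b -> Rep (a, b)
      RList :: Rep a -> Rep [a]

    -- A type-directed deep copy: the traversal is dictated entirely
    -- by the type representation, not by untyped pointer chasing.
    copy :: Rep a -> a -> a
    copy RInt          n      = n
    copy (RPair ra rb) (a, b) = (copy ra a, copy rb b)
    copy (RList ra)    xs     = map (copy ra) xs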

I'm surprised this region calculus hasn't received more attention or follow-up discussion in subsequent papers. It seems eminently practical: it replaces the more complicated alias analyses required in other region calculi with garbage collection of region handles, which looks like a huge improvement over general GC, assuming region inference is sufficiently precise.

Lifted inference: normalizing loops by evaluation

Lifted inference: normalizing loops by evaluation. Oleg Kiselyov and Chung-chieh Shan. 2009 Workshop on Normalization by Evaluation.

Many loops in probabilistic inference map almost every individual in their domain to the same result. Running such loops symbolically takes time sublinear in the domain size. Using normalization by evaluation with first-class delimited continuations, we lift inference procedures to reap this speed-up without interpretive overhead. To express nested loops, we use multiple control delimiters for metacircular interpretation. To express loops over a powerset domain, we convert nested loops over a subset to unnested loops.
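
The headline speed-up can be pictured without the normalization-by-evaluation or delimited-continuation machinery. Here is a Haskell sketch (names invented for illustration) of a "default plus exceptions" representation of a function that is constant almost everywhere; an aggregate loop over it touches only the exceptional individuals, hence runs in time sublinear in the domain size. The paper gets to such symbolic loops automatically rather than by hand-writing the representation.

    import qualified Data.Map as M

    -- A function on a huge domain: one default result plus finitely
    -- many exceptions.
    data Lifted k v = Lifted { dflt :: v, exceptions :: M.Map k v }

    apply :: Ord k => Lifted k v -> k -> v
    apply (Lifted d ex) k = M.findWithDefault d k ex

    -- Summing the function over a domain of size n inspects only the
    -- exceptions, independent of n.
    sumOverDomain :: Num v => Int -> Lifted k v -> v
    sumOverDomain n (Lifted d ex) =
      fromIntegral (n - M.size ex) * d + sum (M.elems ex)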

The paper is a bit hard to follow, but there are enough little tricks here to merit attentive reading. Or better yet, read the code. The basic PLT idea might be summed up as doing abstract interpretation on a shallowly embedded DSL using delimited continuations.

A Verified Compiler for an Impure Functional Language

A Verified Compiler for an Impure Functional Language

We present a verified compiler to an idealized assembly language from a small, untyped functional language with mutable references and exceptions. The compiler is programmed in the Coq proof assistant and has a proof of total correctness with respect to big-step operational semantics for the source and target languages. Compilation is staged and includes standard phases like translation to continuation-passing style and closure conversion, as well as a common subexpression elimination optimization. In this work, our focus has been on discovering and using techniques that make our proofs easy to engineer and maintain. While most programming language work with proof assistants uses very manual proof styles, all of our proofs are implemented as adaptive programs in Coq’s tactic language, making it possible to reuse proofs unchanged as new language features are added.
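
To give a flavor of the first of those phases, here is a tiny Haskell sketch (not from the paper, whose development is in Coq) of what translation to continuation-passing style means: every intermediate result is handed to an explicit continuation, and an effect such as exceptions can then be expressed by threading a second, error continuation.

    -- Direct style versus continuation-passing style.
    fact :: Integer -> Integer
    fact 0 = 1
    fact n = n * fact (n - 1)

    factCPS :: Integer -> (Integer -> r) -> r
    factCPS 0 k = k 1
    factCPS n k = factCPS (n - 1) (\r -> k (n * r))

    -- Exceptions become an ordinary second continuation: failure is
    -- passed to an error continuation instead of being thrown.
    divCPS :: Integer -> Integer -> (Integer -> r) -> (String -> r) -> r
    divCPS _ 0 _   kerr = kerr "division by zero"
    divCPS a b kok _    = kok (a `div` b)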

In this paper, we focus especially on phases of compilation that rearrange the structure of syntax with nested variable binders. That aspect has been a key challenge area in past compiler verification projects, with much more effort expended in the statement and proof of binder-related lemmas than is found in standard pencil-and-paper proofs. We show how to exploit the representation technique of parametric higher-order abstract syntax to avoid the need to prove any of the usual lemmas about binder manipulation, often leading to proofs that are actually shorter than their pencil-and-paper analogues. Our strategy is based on a new approach to encoding operational semantics which delegates all concerns about substitution to the meta language, without using features incompatible with general-purpose type theories like Coq’s logic.
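
Here is a minimal Haskell sketch of the parametric higher-order abstract syntax idea (the paper's development is in Coq, and this shows only the representation, not its verification): binders of the object language are metalanguage functions over an abstract variable type, so an evaluator never needs a substitution function over syntax at all.

    {-# LANGUAGE RankNTypes #-}

    -- Terms are parametric in the type of variables; binders are
    -- represented by metalanguage functions.
    data Term v
      = Var v
      | App (Term v) (Term v)
      | Lam (v -> Term v)

    -- Closed terms must work for every variable representation.
    type ClosedTerm = forall v. Term v

    newtype Value = Closure (Value -> Value)

    -- Instantiating variables with values, evaluation delegates all
    -- substitution to metalanguage application.
    eval :: Term Value -> Value
    eval (Var v)   = v
    eval (App f a) = let Closure g = eval f in g (eval a)
    eval (Lam f)   = Closure (eval . f)

    -- (\x. x) (\y. y)
    example :: ClosedTerm
    example = App (Lam Var) (Lam Var)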

The latest from Adam Chlipala. Yet another evolutionary step for Lambda Tamer. Between this and Ynot, the Coq/certified compiler story seems to be getting more impressive nearly daily.

Effective Interactive Proofs for Higher-Order Imperative Programs

Effective Interactive Proofs for Higher-Order Imperative Programs

We present a new approach for constructing and verifying higher-order, imperative programs using the Coq proof assistant. We build on the past work on the Ynot system, which is based on Hoare Type Theory. That original system was a proof of concept, where every program verification was accomplished via laborious manual proofs, with much code devoted to uninteresting low-level details. In this paper, we present a re-implementation of Ynot which makes it possible to implement fully-verified, higher-order imperative programs with reasonable proof burden. At the same time, our new system is implemented entirely in Coq source files, showcasing the versatility of that proof assistant as a platform for research on language design and verification.

Both versions of the system have been evaluated with case studies in the verification of imperative data structures, such as hash tables with higher-order iterators. The verification burden in our new system is reduced by at least an order of magnitude compared to the old system, by replacing manual proof with automation. The core of the automation is a simplification procedure for implications in higher-order separation logic, with hooks that allow programmers to add domain-specific simplification rules.

We argue for the effectiveness of our infrastructure by verifying a number of data structures and a packrat parser, and we compare to similar efforts within other projects. Compared to competing approaches to data structure verification, our system includes much less code that must be trusted; namely, about a hundred lines of Coq code defining a program logic. All of our theorems and decision procedures have or build machine-checkable correctness proofs from first principles, removing opportunities for tool bugs to create faulty verifications.

For years, Adam Chlipala has been telling us how underutilized Coq's Ltac tactic programming language for proof automation is. Here is the... er... proof.

Certified Web Services in Ynot

Certified Web Services in Ynot

In this paper we demonstrate that it is possible to implement certified web systems in a way not much different from writing Standard ML or Haskell code, including use of imperative features like pointers, files, and socket I/O. We present a web-based course gradebook application developed with Ynot, a Coq library for certified imperative programming. We add a dialog-based I/O system to Ynot, and we extend Ynot’s underlying Hoare logic with event traces to reason about I/O behavior. Expressive abstractions allow the modular certification of both high level specifications like privacy guarantees and low level properties like memory safety and correct parsing.
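
Here is a rough Haskell sketch of what reasoning about I/O via event traces can look like (invented names; in Ynot the traces live inside Coq's Hoare logic rather than in a runtime interpreter): programs are written against a small command type, a pure interpreter records the events they perform, and properties are stated over the resulting trace.

    data Event = Sent String | Received String
      deriving (Eq, Show)

    -- A tiny I/O command language.
    data Net a
      = Done a
      | Send String (Net a)
      | Recv (String -> Net a)

    -- Run a program against scripted input, collecting its trace.
    run :: [String] -> Net a -> (a, [Event])
    run _       (Done a)   = (a, [])
    run ins     (Send s k) = let (a, tr) = run ins k in (a, Sent s : tr)
    run (i:ins) (Recv k)   = let (a, tr) = run ins (k i) in (a, Received i : tr)
    run []      (Recv _)   = error "input exhausted"

    -- A one-shot echo server and a trace property one might verify:
    -- every received message is answered by sending it back.
    echoOnce :: Net ()
    echoOnce = Recv (\msg -> Send msg (Done ()))

    echoes :: [Event] -> Bool
    echoes (Received m : Sent m' : rest) = m == m' && echoes rest
    echoes []                            = True
    echoes _                             = False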

Ynot, always ambitious, takes another serious swing: extracting a real web application from a proof development. In some respects the big news here is the additional coverage that Ynot now offers in terms of support for file and socket I/O, and the event trace mechanism. But there's even bigger news, IMHO, which is the subject of another paper that warrants a separate post.
