Type Theory

Vellvm: Formalizing the LLVM Intermediate Representation for Verified Program Transformations

Vellvm: Formalizing the LLVM Intermediate Representation for Verified Program Transformations by Jianzhou Zhao, Santosh Nagarakatte, Milo M. K. Martin, and Steve Zdancewic, POPL 2012

This paper presents Vellvm (verified LLVM), a framework for reasoning about programs expressed in LLVM's intermediate representation and transformations that operate on it. Vellvm provides a mechanized formal semantics of LLVM's intermediate representation, its type system, and properties of its SSA form. The framework is built using the Coq interactive theorem prover. It includes multiple operational semantics and proves relations among them to facilitate different reasoning styles and proof techniques.

To validate Vellvm's design, we extract an interpreter from the Coq formal semantics that can execute programs from LLVM test suite and thus be compared against LLVM reference implementations. To demonstrate Vellvm's practicality, we formalize and verify a previously proposed transformation that hardens C programs against spatial memory safety violations. Vellvm's tools allow us to extract a new, verified implementation of the transformation pass that plugs into the real LLVM infrastructure; its performance is competitive with the non-verified, ad-hoc original.

This obviously represents huge progress in marrying the theoretical benefits of tools like Coq with the practical benefits of tools like LLVM. We can only hope that this spurs further development in practical certified software development.

Deca, an LtU-friendly bare metal systems programming language

The Deca programming language is "a language designed to provide the advanced features of sophisticated, high-level programming languages while still programming as close as possible to the bare metal. It brings in the functional, object-oriented, and generic programming paradigms without requiring a garbage collector or a threading system, so programmers really only pay in performance for the features they use." The latter link provides a list of features that Deca does, will, and won't provide. Features provided include type inference, universally- and existentially- quantified types, and "a strong region-and-effect system that prohibits unsafe escaping pointers and double-free errors".

The Deca language and ideas behind it are documented in a thesis, The design and implementation of a modern systems programming language (PDF):

Low-level systems programming has remained one of the most consistently difficult tasks in software engineering, since systems programmers must routinely deal with details that programming-language and systems researchers have preferred to abstract away. At least partially, the difficulty arises from not applying the state of the art in programming-languages research to systems programming. I therefore describe the design and implementation of Deca, a systems language based on modern PL principles. Deca makes use of decades in programming-languages research, particularly drawing from the state of the art in functional programming, type systems, extensible data-types and subroutines, modularity, and systems programming-languages research. I describe Deca's feature-set, examine the relevant literature, explain design decisions, and give some of the implementation details for Deca language features. I have been writing a compiler for Deca to translate it into machine code, and I describe the overall architecture of this compiler and some of its details.

The source code for the Deca compiler, decac, is available here. The compiler is implemented in Scala and generates LLVM bytecode. (The author points out in the comments below that this implementation is a work in progress.)

The author of Deca is LtU member Eli Gottlieb, who back in 2008 posted in the forum asking for feedback on his language: Practical Bits of Making a Compiler for a New Language.

There's some more discussion of Deca over at Hacker News.

LTL types FRP

Alan Jeffrey (to appear 2012) LTL types FRP: Linear-time Temporal Logic Propositions as Types, Proofs as Functional Reactive Programs. To be presented at next year's Programming Languages meets Program Verification, (PLPV 2012).
Functional Reactive Programming (FRP) is a form of reactive programming whose model is pure functions over signals. FRP is often expressed in terms of arrows with loops, which is the type class for a Freyd category (that is a premonoidal category with a cartesian centre) equipped with a premonoidal trace. This type system suffices to define the dataflow structure of a reactive program, but does not express its temporal properties. In this paper, we show that Linear-time Temporal Logic (LTL) is a natural extension of the type system for FRP, which constrains the temporal behaviour of reactive programs. We show that a constructive LTL can be defined in a dependently typed functional language, and that reactive programs form proofs of constructive LTL properties. In particular, implication in LTL gives rise to stateless functions on streams, and the “constrains” modality gives rise to causal functions. We show that reactive programs form a partially traced monoidal category, and hence can be given as a form of arrows with loops, where the type system enforces that only decoupled functions can be looped.
Via Alan's G+ feed.

Extensible Programming with First-Class Cases

Extensible Programming with First-Class Cases, by Matthias Blume, Umut A. Acar, and Wonseok Chae:

We present language mechanisms for polymorphic, extensible records and their exact dual, polymorphic sums with extensible first-class cases. These features make it possible to easily extend existing code with new cases. In fact, such extensions do not require any changes to code that adheres to a particular programming style. Using that style, individual extensions can be written independently and later be composed to form larger components. These language mechanisms provide a solution to the expression problem.

We study the proposed mechanisms in the context of an implicitly typed, purely functional language PolyR. We give a type system for the language and provide rules for a 2-phase transformation: first into an explicitly typed λ-calculus with record polymorphism, and finally to efficient index-passing code. The first phase eliminates sums and cases by taking advantage of the duality with records.

We implement a version of PolyR extended with imperative features and pattern matching—we call this language MLPolyR. Programs in MLPolyR require no type annotations—the implementation employs a reconstruction algorithm to infer all types. The compiler generates machine code (currently for PowerPC) and optimizes the representation of sums by eliminating closures generated by the dual construction.

This is an elegant solution to the expression problem for languages with pattern matching. This paper was posted twice in LtU comments, but it definitely deserves its own story. Previous solutions to the exression problem are rather more involved, like Garrigue's use of recursion and polymorphic variants, because they lack support for extensible records which makes this solution so elegant.

Extensible records and first-class cases unify object-oriented and functional paradigms on a deeper level, since they enable first-class messages to be directly encoded. Add a sensible system for dynamics, and I argue you have most of the power people claim of dynamic languages without sacrificing the safety of static typing.

The SAFE Platform

A. Dehon, B. Karel, B. Montagu, B. Pierce, J. Smith, T. Knight, S. Ray, G. Sullivan, G. Malecha, G. Morrisett, R. Pollack, R. Morisset & O. Shivers. Preliminary design of the SAFE platform. In Proceedings of the 6th Workshop on Programming Languages and Operating Systems (PLOS 2011). ACM, Oct. 2011.
ABSTRACT — Safe is a clean-slate design for a secure host architecture, coupling advances in programming languages, operating systems, and hardware, and incorporating formal methods at every step. The project is still at an early stage, but we have identified a set of fundamental architectural choices that we believe will work together to yield a high-assurance system. We sketch the current state of the design and discuss several of these choices.
Proving an operating system correct down to the hardware specification and against a threat model does seem to demand new programming languages and higher-order constructive type theory.

Lightweight Monadic Programming in ML

Lightweight Monadic Programming in ML

Many useful programming constructions can be expressed as monads. Examples include probabilistic modeling, functional reactive programming, parsing, and information flow tracking, not to mention effectful functionality like state and I/O. In this paper, we present a type-based rewriting algorithm to make programming with arbitrary monads as easy as using ML's built-in support for state and I/O. Developers write programs using monadic values of type M t as if they were of type t, and our algorithm inserts the necessary binds, units, and monad-to-monad morphisms so that the program type checks. Our algorithm, based on Jones' qualified types, produces principal types. But principal types are sometimes problematic: the program's semantics could depend on the choice of instantiation when more than one instantiation is valid. In such situations we are able to simplify the types to remove any ambiguity but without adversely affecting typability; thus we can accept strictly more programs. Moreover, we have proved that this simplification is efficient (linear in the number of constraints) and coherent: while our algorithm induces a particular rewriting, all related rewritings will have the same semantics. We have implemented our approach for a core functional language and applied it successfully to simple examples from the domains listed above, which are used as illustrations throughout the paper.

This is an intriguing paper, with an implementation in about 2,000 lines of OCaml. I'm especially interested in its application to probabilistic computing, yielding a result related to Kiselyov and Shan's Hansei effort, but without requiring delimited continuations (not that there's anything wrong with delimited continuations). On a theoretical level, it's nice to see such a compelling example of what can be done once types are freed from the shackle of "describing how bits are laid out in memory" (another such compelling example, IMHO, is type-directed partial evaluation, but that's literally another story).

Kleisli Arrows of Outrageous Fortune

Kleisli Arrows of Outrageous Fortune

When we program to interact with a turbulent world, we are to some extent at its mercy. To achieve safety, we must ensure that programs act in accordance with what is known about the state of the world, as determined dynamically. Is there any hope to enforce safety policies for dynamic interaction by static typing? This paper answers with a cautious ‘yes’.

Monads provide a type discipline for effectful programming, mapping value types to computation types. If we index our types by data approximating the ‘state of the world’, we refine our values to witnesses for some condition of the world. Ordinary monads for indexed types give a discipline for effectful programming contingent on state, modelling the whims of fortune in way that Atkey’s indexed monads for ordinary types do not (Atkey, 2009). Arrows in the corresponding Kleisli category represent computations which a reach a given postcondition from a given precondition: their types are just specifications in a Hoare logic!

By way of an elementary introduction to this approach, I present the example of a monad for interacting with a file handle which is either ‘open’ or ‘closed’, constructed from a command interface specfied Hoare-style. An attempt to open a file results in a state which is statically unpredictable but dynamically detectable. Well typed programs behave accordingly in either case. Haskell’s dependent type system, as exposed by the Strathclyde Haskell Enhancement preprocessor, provides a suitable basis for this simple experiment.

I discovered this Googling around in an attempt to find some decent introductory material to Kleisli arrows. This isn't introductory, but it's a good resource. :-) The good introductory material I found was this.

Asynchronous Proof Processing with Isabelle/Scala and Isabelle/jEdit

Asynchronous Proof Processing with Isabelle/Scala and Isabelle/jEdit. Makarius Wenzel, UITP 2010.

After several decades, most proof assistants are still centered around TTY-based interaction in a tight read-eval-print loop. Even well-known Emacs modes for such provers follow this synchronous model based on single commands with immediate response, meaning that the editor waits for the prover after each command. There have been some attempts to re-implement prover interfaces in big IDE frameworks, while keeping the old interaction model. Can we do better than that?

Ten years ago, the Isabelle/Isar proof language already emphasized the idea of proof document (structured text) instead of proof script (sequence of commands), although the implementation was still emulating TTY interaction in order to be able to work with the then emerging Proof General interface. After some recent reworking of Isabelle internals, to support parallel processing of theories and proofs, the original idea of structured document processing has surfaced again.

Isabelle versions from 2009 or later already provide some support for interactive proof documents with asynchronous checking, which awaits to be connected to a suitable editor framework or full-scale IDE. The remaining problem is how to do that systematically, without having to specify and implement complex protocols for prover interaction.

This is the point where we introduce the new Isabelle/Scala layer, which is meant to expose certain aspects of Isabelle/ML to the outside world. The Scala language (by Martin Odersky) is sufficiently close to ML in order to model well-known prover concepts conveniently, but Scala also runs on the JVM and can access existing Java libraries directly. By building more and more external system wrapping for Isabelle in Scala, we eventually reach the point where we can integrate the prover seamlessly into existing IDEs (say Netbeans).

To avoid getting side-tracked by IDE platform complexity, our current experiments are focused on jEdit, which is a powerful editor framework written in Java that can be easily extended by plugin modules. Our plugins are written again in Scala for our convenience, and to leverage the Scala actor library for parallel and interactive programming. Thanks to the Isabelle/Scala layer, the Isabelle/jEdit implementation is very small and simple.

I thought this was a nice paper on the pragmatics of incremental, interactive proof editing. I've suspected for a while that as programming languages and IDEs grow more sophisticated and do more computationally-intensive checks at compile time (including but not limited to theorem proving), it will become similarly important to design our languages to support modular and incremental analysis.

However, IDE designs also need more experimentation, and unfortunately the choice of IDEs to extend seem to be limited to archaic systems like Emacs or industrial behemoths like Eclipse or Visual Studio, both of which constrain the scope for new design -- Emacs is too limited, and the API surface of Eclipse/VS is just too big and irregular. (Agda-mode for Emacs is a heroic but somewhat terrifying piece of elisp.)

Type-checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance

Type-checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance. Eric Allen, Justin Hilburn, Scott Kilpatrick, Sukyoung Ryu, David Chase, Victor Luchangco, and Guy L. Steele Jr. Submitted for publication, December 2010.

Multiple dynamic dispatch poses many problems for a statically typed language with nominal subtyping and multiple inheritance. In particular, there is a tension between modularity and extensibility when trying to ensure at compile time that each function call dispatches to a unique most specific definition at run time. We have previously shown how requiring that each set of overloaded definitions forms a meet semi-lattice allows one to statically ensure the absence of ambiguous calls while maintaining modularity and extensibility.

In this paper, we extend the rules for ensuring safe overloaded functions to a type system with parametric polymorphism. We show that such an extension can be reduced to the problem of determining subtyping relationships between universal and existential types. We also ensure that syntactically distinct types inhabited by the same sets of values are equivalent under subtyping. Our system has been implemented as part of the open source Fortress compiler.

So it's Sunday, and I'm researching the interaction of multiple dispatch, type-checking, and parametric polymorphism, when Google spits out this new paper by the Fortress team.

Scott Kilpatrick's Master's Thesis Ad Hoc: Overloading and Language Design puts it into perspective:

The intricate concepts of ad-hoc polymorphism and overloading permeate the field of programming languages despite their somewhat nebulous definitions. With the perspective afforded by the state of the art, object-oriented Fortress programming language, this thesis presents a contemporary account of ad-hoc polymorphism and overloading in theory and in practice. Common language constructs are reinterpreted with a new emphasis on overloading as a key facility.

Furthermore, concrete problems with overloading in Fortress, encountered during the author's experience in the development of the language, are presented with an emphasis on the ad hoc nature of their solutions.

All of this is a bit over my head, unfortunately, but I find it very interesting nevertheless. And it's heartening that the somewhat neglected multiple dispatch (and multiple inheritance!) is further developed by such a high caliber team.

The Triumph of Types: Principia Mathematica's Impact on Computer Science

The Triumph of Types: Principia Mathematica's Impact on Computer Science. Robert L. Constable

The role the ideas of Principia Mathematica played in type theory in programming languages is often alluded to in our discussions, making this contribution to a meeting celebrating the hundredth anniversary of Whitehead-and-Russell's opus provocative.

To get your juices going here is a quote from page 3:

...I will discuss later our efforts at Cornell to create one such type theory, Computational Type Theory (CTT), very closely related to two others, the Calculus of Inductive Constructions (CIC) implemented in the Coq prover and widely used, and Intuitionistic Type Theory (ITT) implemented in the Alf and Agda provers. All three of these efforts, but especially CTT and ITT, were strongly influenced by Principia and the work of Bishop presented in his book Foundations of Constructive Analysis.
XML feed