Gradual Typing for Functional Languages

Gradual Typing for Functional Languages

Static and dynamic type systems have well-known strengths and weaknesses, and each is better suited for different programming tasks. There have been many efforts to integrate static and dynamic typing and thereby combine the benefits of both typing disciplines in the same language. The flexibility of static typing can be improved by adding a type Dynamic and a typecase form. The safety and performance of dynamic typing can be improved by adding optional type annotations or by performing type inference (as in soft typing). However, there has been little formal work on type systems that allow a programmer-controlled migration between dynamic and static typing. Thatte proposed Quasi-Static Typing, but it does not statically catch all type errors in completely annotated programs. Anderson and Drossopoulou defined a nominal type system for an object-oriented language with optional type annotations. However, developing a sound, gradual type system for functional languages with structural types is an open problem.

In this paper we present a solution based on the intuition that the structure of a type may be partially known/unknown at compile-time and the job of the type system is to catch incompatibilities between the known parts of types. We define the static and dynamic semantics of a λ-calculus with optional type annotations and we prove that its type system is sound with respect to the simply-typed λ-calculus for fully-annotated terms. We prove that this calculus is type safe and that the cost of dynamism is “pay-as-you-go”.
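To make the "known parts of types" intuition concrete, here is a minimal Python sketch of a type consistency check in the spirit of the paper. The class names (`Base`, `Dyn`, `Fun`) and the function `consistent` are my own illustrative choices, not the authors' formalization: two types are consistent when they agree wherever both are known, and the unknown type agrees with anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:
    name: str  # a fully known base type, e.g. "int" or "bool"


@dataclass(frozen=True)
class Dyn:
    """The unknown type (hypothetical name; stands for a missing annotation)."""


@dataclass(frozen=True)
class Fun:
    param: object   # parameter type
    result: object  # result type


def consistent(s, t):
    """Return True when the known parts of s and t agree."""
    if isinstance(s, Dyn) or isinstance(t, Dyn):
        return True  # nothing is known on one side, so nothing can clash
    if isinstance(s, Fun) and isinstance(t, Fun):
        return consistent(s.param, t.param) and consistent(s.result, t.result)
    return s == t  # two fully known types must be identical


INT, BOOL, UNKNOWN = Base("int"), Base("bool"), Dyn()

# int -> ? and ? -> bool agree on every part that both sides know.
print(consistent(Fun(INT, UNKNOWN), Fun(UNKNOWN, BOOL)))
# int and bool are both fully known and differ.
print(consistent(INT, BOOL))
```

Note that the relation is not transitive: `INT` is consistent with `UNKNOWN`, and `UNKNOWN` with `BOOL`, yet `INT` is not consistent with `BOOL`. That is what lets a checker built on it still reject mismatches between fully annotated parts.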

In other news, the Holy Grail has been found. Film at 11.

This piece of genius is the combined work of Jeremy Siek, of Boost fame, and Walid Taha, of MetaOCaml fame. The formalization of their work in Isabelle/Isar can be found here.

I found this while tracking down the relocated Concoqtion paper. In that process, I also found Jeremy Siek's other new papers, including his "Semantic Analysis of C++ Templates" and "Concepts: Linguistic Support for Generic Programming in C++." Just visit Siek's home page and read all of his new papers, each of which is worth a story here.

Lightweight Static Capabilities (II)

The slides for the talk discussed here are now available for download. Like all of Ken and Oleg's work, this stuff is both cool and important.

Keep in mind that the talk is quite different from the paper. The safety claims were formalized and proved in Twelf: list example, array example. To follow the proofs you should read them alongside the presentation slides. I am told that the first file might change soon, to reflect a more general proof. Perhaps Ken or Oleg would like to comment on the experience of doing mechanized proofs, a subject that comes up regularly on LtU.

LtU newcomers, who already managed to take in a little Haskell or ML, may want to spend a little time chewing on this, and ask questions if something remains unclear, since this work may eventually have practical importance, and can teach quite a few interesting techniques.

The Daikon Invariant Detector


Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. An invariant is a property that holds at a certain point or points in a program; these are often seen in assert statements, documentation, and formal specifications. Invariants can be useful in program understanding and a host of other applications. Examples include “x.field > abs(y)”; “y = 2*x+3”; “array a is sorted”; “for all list objects lst, lst.next.prev = lst”; “for all treenode objects n, n.left.value < n.right.value”; “p != null => p.content in myArray”; and many more. You can extend Daikon to add new properties (see Enhancing Daikon output).

Dynamic invariant detection runs a program, observes the values that the program computes, and then reports properties that were true over the observed executions.
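The run, observe, report loop is simple enough to sketch in a few lines of Python. This is only a toy illustration of the idea and not Daikon's actual algorithm or API; the function `likely_invariants`, the observed `program`, and the candidate properties are all hypothetical.

```python
def likely_invariants(observations, candidates):
    """Return the names of the candidates that held on every observation."""
    surviving = dict(candidates)
    for obs in observations:
        for name in list(surviving):
            if not surviving[name](obs):
                del surviving[name]  # falsified once, so discard it
    return sorted(surviving)


def program(x):
    # Hypothetical program under observation.
    return 2 * x + 3


# "Run" the program and record the values seen at its exit point.
observations = [{"x": x, "y": program(x)} for x in (-5, -1, 0, 2, 7)]

# Candidate properties to test against every observation.
candidates = {
    "y == 2*x + 3": lambda o: o["y"] == 2 * o["x"] + 3,
    "y > x": lambda o: o["y"] > o["x"],
    "x >= 0": lambda o: o["x"] >= 0,
    "y is odd": lambda o: o["y"] % 2 == 1,
}

report = likely_invariants(observations, candidates)
print(report)  # ['y == 2*x + 3', 'y is odd']
```

The word "likely" matters: had the test inputs all been non-negative, "x >= 0" and "y > x" would have been reported too. The output is only as good as the executions that were observed.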

Daikon can detect properties in C, C++, Java, Perl, and IOA programs; in spreadsheet files; and in other data sources. (Dynamic invariant detection is a machine learning technique that can be applied to arbitrary data.) It is easy to extend Daikon to other applications; as one example, an interface exists to the Java PathFinder model checker.

I spend a lot of time here talking about static typing, but let's face it: most often we're dealing with existing code that probably isn't written in a language with a very expressive type system, and rarely has been formally specified, whether through an expressive type system or otherwise. Daikon is interesting because it attempts to learn important properties of a piece of software by execution and observation. Combine it with a model checker like Java PathFinder, and you have an unusually powerful means of evolving the correctness of even quite complex code. There's also a relationship to the already-mentioned JML and ESC/Java 2, which in turn has a connection to the popular JUnit unit-testing framework. In short, the gap between compile time and runtime, and static vs. dynamic typing, seems to be narrowed in powerful ways by these, and related, tools.

Interface Automata

Interface Automata
by Luca de Alfaro, Thomas A. Henzinger

Conventional type systems specify interfaces in terms of values and domains.
We present a light-weight formalism that captures the temporal aspects of software
component interfaces. Specifically, we use an automata-based language to capture both
input assumptions about the order in which the methods of a component are called,
and output guarantees about the order in which the component calls external methods.
The formalism supports automatic compatibility checks between interface models, and
thus constitutes a type system for component interaction. Unlike traditional uses of
automata, our formalism is based on an optimistic approach to composition, and on
an alternating approach to design refinement. According to the optimistic approach,
two components are compatible if there is some environment that can make them work
together. According to the alternating approach, one interface refines another if it
has weaker input assumptions, and stronger output guarantees. We show that these
notions have game-theoretic foundations that lead to efficient algorithms for checking
compatibility and refinement.
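For a feel of what the optimistic approach means in practice, here is a rough Python sketch of a compatibility check, under simplifying assumptions of my own: the automata are deterministic, the components have no internal actions of their own, and all names (`interface`, `compatible`, the example components) are hypothetical. A joint state is illegal when one side is ready to emit a shared action the other cannot accept, and the pair is rejected only if such a state can be reached by steps the environment cannot block.

```python
from collections import deque


def interface(inputs, outputs, trans, initial=0):
    """Bundle an automaton; trans maps (state, action) -> next state."""
    return {"inputs": set(inputs), "outputs": set(outputs),
            "trans": dict(trans), "initial": initial}


def compatible(p, q):
    """Optimistic check: can some environment keep p and q out of trouble?"""
    shared = (p["outputs"] & q["inputs"]) | (q["outputs"] & p["inputs"])

    def enabled(m, state, action):
        return (state, action) in m["trans"]

    def illegal(sp, sq):
        # One side is ready to emit a shared action the other cannot accept.
        for a in shared:
            if a in p["outputs"] and enabled(p, sp, a) and not enabled(q, sq, a):
                return True
            if a in q["outputs"] and enabled(q, sq, a) and not enabled(p, sp, a):
                return True
        return False

    def autonomous_steps(sp, sq):
        # Steps the environment cannot block: synchronised shared actions
        # and unshared outputs. Unshared inputs are left to the environment.
        for (s, a), t in p["trans"].items():
            if s != sp:
                continue
            if a in shared:
                if enabled(q, sq, a):
                    yield (t, q["trans"][(sq, a)])
            elif a in p["outputs"]:
                yield (t, sq)
        for (s, a), t in q["trans"].items():
            if s == sq and a not in shared and a in q["outputs"]:
                yield (sp, t)

    # Breadth-first search over joint states using autonomous steps only.
    start = (p["initial"], q["initial"])
    seen, queue = {start}, deque([start])
    while queue:
        sp, sq = queue.popleft()
        if illegal(sp, sq):
            return False
        for nxt in autonomous_steps(sp, sq):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


# A service that must be opened before it is read.
service = interface({"open", "read"}, set(), {(0, "open"): 1, (1, "read"): 1})

# A client that misbehaves only if its environment sends "go_rash".
client = interface({"go_safe", "go_rash"}, {"open", "read"},
                   {(0, "go_safe"): 1, (1, "open"): 2, (2, "read"): 2,
                    (0, "go_rash"): 3, (3, "read"): 3})

# A client that reads immediately, whatever the environment does.
reckless = interface(set(), {"read"}, {(0, "read"): 0})

print(compatible(client, service))    # True
print(compatible(reckless, service))  # False
```

A pessimistic check would reject `client` as well, since the bad state is reachable if the environment sends `go_rash`. The optimistic view accepts it because a helpful environment can simply never send that input.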

The idea of expressing the order of message exchange as a type is certainly not new (as anyone exposed to web service choreography hype can tell - oh, just kidding, of course the theory is much older). However, this specific approach looks interesting, not least because it appeals to game semantics.

Lightweight Static Capabilities

Lightweight Static Capabilities

We describe a modular programming style that harnesses modern type systems to verify safety conditions in practical systems. This style has three ingredients:

  1. A compact kernel of trust that is specific to the problem domain.
  2. Unique names (capabilities) that confer rights and certify properties, so as to extend the trust from the kernel to the rest of the application.
  3. Static (type) proxies for dynamic values.

We illustrate our approach using examples from the dependent-type literature, but our programs are written in Haskell and OCaml today, so our techniques are compatible with imperative code, native mutable arrays, and general recursion. The three ingredients of this programming style call for (1) an expressive core language, (2) higher-rank polymorphism, and (3) phantom types.

This is pursuant to this thread about the membrane pattern (from Mark Miller's excellent Ph.D. thesis) in static languages. I don't yet know whether a solution is derivable from this work, but Mark was kind enough to point me to it, and Oleg seems to want to see it distributed, so here it is. Mark and/or Oleg, please let me know if this is premature.

Concoqtion: Mixing Indexed Types and Hindley-Milner Type Inference

From the "Whoa!" files:

Concoqtion: Mixing Indexed Types and Hindley-Milner Type Inference

This paper addresses the question of how to extend OCaml’s Hindley-Milner type system with types indexed by logical propositions and proofs of the Coq theorem prover, thereby providing an expressive and extensible mechanism for ensuring fine-grained program invariants. We propose adopting the approach used by Shao et al. for certified binaries. This approach maintains a phase distinction between the computational and logical languages, thereby limiting effects and non-termination to the computational language, and maintaining the decidability of the type system. The extension subsumes language features such as impredicative first-class (higher-rank) polymorphism and type operators, that are notoriously difficult to integrate with the Hindley-Milner style of type inference that is used in OCaml. We make the observation that these features can be more easily integrated with type inference if the inference algorithm is free to adapt the order in which it solves typing constraints to each program. To this end we define a novel “order-free” type inference algorithm. The key enabling technology is a graph representation of constraints and a constraint solver that performs Hindley-Milner inference with just three graph rewrite rules.

Another tough-to-categorize one: dependent types, the Curry-Howard Correspondence, logic programming, theorem provers as subsystems of compilers, implementation issues... it's all in here.

Update: A prototype implementation is available here, but it took a bit of Google-fu to find, and it's brand new, so be gentle.

Update II: The prototype implementation isn't buildable out of the box, and includes a complete copy of both the Coq and O'Caml distributions, presumably with patches etc. already applied. So it's clearly extremely early days yet. But this feels very timely to me, perhaps because I've just started using Coq within the past couple of weeks, and got my copy of Coq'Art and am enjoying it immensely.

Update III: It occurs to me that this might also relate to Vesa Karvonen's comment about type-indexed functions, which occurs in the thread on statically-typed capabilities, so there might be a connection between this front-page story and the front-page story on lightweight static capabilities. That thought makes me happy; I love it when concepts converge.

Socially Responsive, Environmentally Friendly Logic

Socially Responsive, Environmentally Friendly Logic
by Samson Abramsky

We consider the following questions: What kind of logic has a natural semantics in
multi-player (rather than 2-player) games? How can we express branching quantifiers, and
other partial-information constructs, with a properly compositional syntax and semantics?
We develop a logic in answer to these questions, with a formal semantics based on multiple
concurrent strategies, formalized as closure operators on Kahn-Plotkin concrete domains.
Partial information constraints are represented as co-closure operators. We address the
syntactic issues by treating syntactic constituents, including quantifiers, as arrows in a
category, with arities and co-arities. This enables a fully compositional account of a wide
range of features in a multi-agent, concurrent setting, including IF-style quantifiers.

This paper seems to unify multiple interesting directions - logic, game semantics, concurrent constraint programming (and concurrent programming in general).

At the same time it remains very accessible, without an overwhelming amount of math, so hopefully it can be useful not only for academics. I, for one, have been waiting for exactly this kind of paper for two years (and my interest is very practical).

Multiplayer Curry-Howard correspondence, anyone? Or Curry-Howard for web services?

Abstracting Allocation: The New new Thing

Abstracting Allocation: The New new Thing. Nick Benton.

We introduce a Floyd-Hoare-style framework for specification and verification of machine code programs, based on relational parametricity (rather than unary predicates) and using both step-indexing and a novel form of separation structure. This yields compositional, descriptive and extensional reasoning principles for many features of low-level sequential computation: independence, ownership transfer, unstructured control flow, first-class code pointers and address arithmetic. We demonstrate how to specify and verify the implementation of a simple memory manager and, independently, its clients in this style. The work has been fully machine-checked within the Coq proof assistant.

This is, of course, related to TAL, PCC etc. If you find the details too much, I suggest reading the discussion (section 7) to get a feel for the possible advantages of this approach.

Programming Languages and Lambda Calculi

Programming Languages and Lambda Calculi looks like a comprehensive treatment of the semantics of typed and untyped call-by-value programming languages. I imagine that if one had a basic undergraduate education in programming language theory and wanted to get up to speed in a hurry, this would be a great resource.

Securing the .NET Programming Model

Securing the .NET Programming Model. Andrew J. Kennedy.

The security of the .NET programming model is studied from the standpoint of fully abstract compilation of C#. A number of failures of full abstraction are identified, and fixes described. The most serious problems have recently been fixed for version 2.0 of the .NET Common Language Runtime.

This is highly amusing stuff, of course. Some choice quotes:

if source-language compilation is not fully abstract, then there exist contexts (think ‘attackers’) in the target language that can observably distinguish two program fragments not distinguishable by source contexts. Such abstraction holes can sometimes be turned into security holes: if the author of a library has reasoned about the behaviour of his code by considering only source-level contexts (i.e. other components written in the same source language), then it may be possible to construct a component in the target language which provokes unexpected and damaging behaviour.

One could argue that full abstraction is just a nicety; programmers don’t really reason about observations, program contexts, and all that, do they? Well, actually, I would like to argue that they do. At least, expert programmers...

"A C# programmer can reason about the security properties of component A by considering the behaviour of another component B written in C# that “attacks” A through its public API." This can only be achieved if compilation is fully abstract.

To see the six problems identified by thinking about full abstraction you'll have to go read the paper...
