Gradual Typing for Functional Languages

Static and dynamic type systems have well-known strengths and weaknesses, and each is better suited for different programming tasks. There have been many efforts to integrate static and dynamic typing and thereby combine the benefits of both typing disciplines in the same language. The flexibility of static typing can be improved by adding a type Dynamic and a typecase form. The safety and performance of dynamic typing can be improved by adding optional type annotations or by performing type inference (as in soft typing). However, there has been little formal work on type systems that allow a programmer-controlled migration between dynamic and static typing. Thatte proposed Quasi-Static Typing, but it does not statically catch all type errors in completely annotated programs. Anderson and Drossopoulou defined a nominal type system for an object-oriented language with optional type annotations. However, developing a sound, gradual type system for functional languages with structural types is an open problem.

In this paper we present a solution based on the intuition that the structure of a type may be partially known/unknown at compile-time and the job of the type system is to catch incompatibilities between the known parts of types. We define the static and dynamic semantics of a λ-calculus with optional type annotations and we prove that its type system is sound with respect to the simply-typed λ-calculus for fully-annotated terms. We prove that this calculus is type safe and that the cost of dynamism is “pay-as-you-go”.

In other news, the Holy Grail has been found. Film at 11.

This piece of genius is the combined work of Jeremy Siek, of Boost fame, and Walid Taha, of MetaOCaml fame. The formalization of their work in Isabelle/Isar can be found here.

I found this while tracking down the relocated Concoqtion paper. In that process, I also found Jeremy Siek's other new papers, including his "Semantic Analysis of C++ Templates" and "Concepts: Linguistic Support for Generic Programming in C++." Just visit Siek's home page and read all of his new papers, each of which is worth a story here.

Hmm

I'm not convinced. Inference often alleviates the need for explicit annotations, and the type system then only bothers you when you try to do something that would fail at runtime under dynamic checking anyway.

Does anyone find themselves using Haskell's Dynamic type often?

More Work To Do

First, if your point is that the usefulness of "Dynamic" might be overrated, I completely agree.

But I think the bigger point here is that they've defined a calculus that's been proven sound with respect to the simply-typed lambda calculus. There's still a ton more work to do. As they state in their conclusion:

As future work, we intend to investigate the interaction between our gradual type system and types such as lists, arrays, algebraic data types, and implicit coercions between types, such as the types in Scheme’s numerical tower. We also plan to investigate the interaction between gradual typing and parametric polymorphism [16, 34] and Hindley-Milner inference [9, 21, 27]. We have implemented and tested an interpreter for the λ?→ calculus. As future work we intend to incorporate gradual typing as presented here into a mainstream dynamically typed programming language and perform studies to evaluate whether gradual typing can benefit programmer productivity.

In other words, supporting all of the stuff we expect from a modern language is yet to come. Nevertheless, this is (AFAIK) the first formalization of a total "slider" from the untyped lambda calculus to the simply-typed lambda calculus, making it a crucial step in the formalization of future (obviously not conservative) extensions.

While I don't share their

While I don't share their enthusiasm for adding dynamic typing support directly to a type system, I can now see the wider importance.

Where do you see this coming in useful? The two main applications I can see are runtime generation of code, as in MetaOCaml, and distribution/dynamic loading, as in Acute.

The Value for Me...

...crucially lies in "However, there has been little formal work on type systems that allow a programmer-controlled migration between dynamic and static typing." All I ever wanted (besides my two front teeth, of course) was control over what gets proved at compile time and what I'm willing to let slide until runtime. That's it. It'll be unfortunate if the only way to get that level of control is through annotations, though.
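
To make the wish concrete, here is a hypothetical OCaml sketch of the two ends of that slider, written by hand today (all names are mine). Gradual typing's promise is that one definition could move between these poles just by adding or removing an annotation, with the compiler inserting the runtime checks on the dynamic side.

```ocaml
(* The "let it slide until runtime" end: values carry tags and every
   use is checked dynamically. *)
type dyn = DInt of int | DBool of bool

let incr_dyn (x : dyn) : dyn =
  match x with
  | DInt n -> DInt (n + 1)
  | DBool _ -> failwith "runtime type error: expected an int"

(* The "proved at compile time" end: the same function, with the error
   above ruled out statically at every call site. *)
let incr (x : int) : int = x + 1
```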

It probably has to be in

It probably has to be in some sense, unfortunately. I don't know that that's so horrible, so long as annotations are only needed at the call site rather than in the callee. The rest comes down to elegant ways to block-annotate, especially if we can retain type inference in sections annotated with a request to use it (or as a possible default).

Link to citation on LtU

Interested readers might also like to read about my group's cited project, Sage, which appeared on LtU a little while ago. Sage has a type Dynamic mixed in with other types, and applies the machinery of Hybrid Type Checking to insert the necessary casts.

Sage and catching errors statically

Hi Ken. I've been reading about Sage... cool language! One question that I couldn't find the answer to is the following: if a Sage program does not mention the type Dynamic (it is fully annotated), does the type system guarantee that all type errors (ignoring refinement types for now) will be caught statically and that no casts will be inserted? When a language has subtyping and a subsumption rule, and implicit downcasts, this property is tricky to achieve. For example, Thatte's Quasi-Static Typing does not have it.

I also have one detailed question about the type system of Sage. In the Scheme Workshop 2006 paper, Figures 6 and 7 define the type system for the source language. In Figure 6 I see a subsumption rule, and in Figure 7 I see the usual T subtype Dynamic rule, so Sage has implicit upcasts, but I do not see a rule that allows for implicit downcasts in the source language. Am I missing something?

Type checking is still undecidable

if a Sage program does not mention the type Dynamic (it is fully annotated), does the type system guarantee that all type errors (ignoring refinement types for now) will be caught statically and that no casts will be inserted?

Nope. Even without Dynamic or refinements, we have arbitrary computations on types, so type checking is undecidable even in a fully annotated program. For example, if computing the type of printf from its format string takes too long, a cast will be inserted.

I do not see a rule that allows for implicit downcasts in the source language.

It seems our presentation of Dynamic is a bit confusing. There are no implicit casts in the type system, so Dynamic is the same as a Top type. In your paper you say "It may be that the given type system was intended to characterize the output of compilation" which is an interesting interpretation. I think of the type system as specifying those programs that could be statically typed given a perfect theorem proving oracle.

So we know that Dynamic is not a subtype of Int, but in the algorithmic presentation we say "maybe" to induce the cast we want. And since compilation outputs a well-typed program, the usual soundness then applies.
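
In code, the shape of that is roughly the following minimal sketch (the types and names are illustrative, not Sage's actual implementation):

```ocaml
(* Illustrative types: Dynamic behaves as a Top type. *)
type ty = Dynamic | Int | Bool

type term =
  | Var of string
  | Cast of ty * term              (* runtime-checked cast *)

(* Three-valued answer from the subtyping algorithm. *)
type tri = Yes | No | Maybe

exception Type_error of string

(* A decider in the spirit of the discussion: T <: Dynamic is a
   definite Yes, but Dynamic <: Int is only Maybe. *)
let sub (s : ty) (t : ty) : tri =
  match s, t with
  | _, Dynamic -> Yes
  | Dynamic, _ -> Maybe
  | _, _ -> if s = t then Yes else No

(* Compiling a use of e (of type s) where t is expected: Maybe induces
   a cast, so the output program is well-typed and the usual soundness
   applies to it. *)
let coerce (s : ty) (t : ty) (e : term) : term =
  match sub s t with
  | Yes -> e
  | No -> raise (Type_error "not a subtype")
  | Maybe -> Cast (t, e)
```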

I have another minor quibble from your paper:

(though it is missing a rule for cast)

A cast in Sage is just an application of the constant "Cast", which has type "T:* -> Dynamic -> T", so it doesn't require a type rule.

We have to take the blame for this confusion as well; implementing every feature as a constant with fancy semantics is a decision that was fun at the time, and made the implementation cute, but probably causes too much confusion (and you have to prove things the same anyhow).
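
For intuition, here is one minimal OCaml realization of such a checked cast, using a GADT of runtime type representations (a tiny two-type universe of my own; Sage's types are of course far richer):

```ocaml
(* Runtime representations of types. *)
type _ ty = TInt : int ty | TBool : bool ty

(* A dynamic value: a value paired with the representation of its type. *)
type dyn = Dyn : 'a ty * 'a -> dyn

exception Cast_error

(* Morally "Cast : T:* -> Dynamic -> T": compare the expected and the
   carried representation at runtime, and fail on a mismatch. *)
let cast : type a. a ty -> dyn -> a =
 fun want d ->
  match d with
  | Dyn (got, v) ->
    (match want, got with
     | TInt, TInt -> v
     | TBool, TBool -> v
     | _, _ -> raise Cast_error)
```

For example, `cast TInt (Dyn (TInt, 41)) + 1` evaluates to `42`, while `cast TBool (Dyn (TInt, 41))` raises `Cast_error`.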

So the declarative specification is missing?

If Figures 6 and 7 define a type system for an intermediate language, and Figures 8 and 9 give an algorithm for type checking Sage, where is the type system for Sage itself? Where's the declarative specification that defines the static semantics of the language? Or do you mean for the algorithm to define the language? That would be somewhat unusual.

And if the type computations terminate?

Even without Dynamic or refinements, we have arbitrary computations on types, so type checking is undecidable even in a fully annotated program.

Suppose the program has only terminating type computations. Can you then provide the static type safety guarantee? Basically, this boils down to what kinds of type errors, if any, do you guarantee to catch at compile time?

The subtyping algorithm is pluggable

Basically, this boils down to what kinds of type errors, if any, do you guarantee to catch at compile time?

The best answer is from Section 5.1, where we explain the requirements on a decider for subtyping: "Clearly a naive subtyping algorithm could always return the result '?' and thus trivially satisfy these requirements, but more precise results enable Sage to verify more properties and to detect more errors at compile time"

The abstraction implies that subtyping may invoke any finite number of decision procedures. So any decidable error may be detected, but no infinite family of errors can be (that would amount to solving the halting problem).
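
Concretely, one can picture the pluggable decider as a list of sound-but-incomplete procedures tried in order; here is a hypothetical sketch (the combinator and names are mine):

```ocaml
type tri = Yes | No | Maybe

(* Try each decision procedure in turn and return the first definite
   answer. A decider that always answers Maybe is trivially acceptable;
   each additional procedure catches more errors at compile time. *)
let first_sure (deciders : ('t -> 't -> tri) list) (s : 't) (t : 't) : tri =
  List.fold_left
    (fun acc decide ->
       match acc with
       | Yes | No -> acc          (* definite answer already found *)
       | Maybe -> decide s t)
    Maybe deciders
```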

Simplify

Kenneth, thanks for joining us here. I'm interested in exploring Sage, but unfortunately I haven't been successful in installing Modula-3 and building Simplify. Are there plans to generalize Sage's use of a theorem prover so that a different one could be substituted, or at least to move to a different one—perhaps Coq or MetaPRL, since both are also written in O'Caml?

Development doubtful

There aren't really any plans to develop Sage any further; a better next step is to experiment in a more mainstream language (OCaml? GHC?) to scale up the size and number of programs we are analyzing. In that implementation effort, modularity with respect to the theorem prover would be important (it is a key point of our theoretical work).

By the way, I think there are a couple of Simplify executables in the Sage distribution. If those aren't for your architecture, you can run Sage with -nosimplify and it will say "Maybe" to every judgement that requires Simplify, or maybe we can come up with another build.

Walid rocks

Walid Taha rocks. I can see him from my office :)

redefining type safety

There is a deep and exciting redefinition at the core of this work: the change from subsumption to consistency. Previous type-checking work tries to prove that whenever you use an A in a position that expects a B, all A's are also B's. This paper relaxes that, saying that A's and B's merely have to be "consistent" with each other.

Philosophically, this redefinition gets back to what I think of as the heart of static typing: you want to prevent code that is completely broken, but otherwise let the program run. From that point of view, the usual "all A's are B's" subsumption approach is overly paranoid.

Additionally, it is very interesting that the simply-typed lambda calculus can be fully captured with this "consistency" definition. It makes me wonder whether other, existing subtyping-based languages should explore consistency in their type systems instead of subsumption.
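
For reference, the consistency relation from the paper is tiny. Here is a minimal OCaml sketch of it (the constructor names are mine):

```ocaml
(* Simple types extended with the unknown type, written ? in the paper. *)
type ty = Dyn | TInt | TBool | Arrow of ty * ty

(* Consistency: the known parts of the two types must agree. Unlike
   subtyping it is symmetric, and it is not transitive: TInt ~ Dyn and
   Dyn ~ TBool, yet TInt is not consistent with TBool. On ?-free types
   it collapses to plain type equality, which is why fully annotated
   terms behave exactly as in the simply-typed lambda calculus. *)
let rec consistent (s : ty) (t : ty) : bool =
  match s, t with
  | Dyn, _ | _, Dyn -> true
  | TInt, TInt | TBool, TBool -> true
  | Arrow (s1, s2), Arrow (t1, t2) ->
      consistent s1 t1 && consistent s2 t2
  | _, _ -> false
```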

Certainly gradual typing is something to look at for existing dynamically typed languages. You can add it to the compiler without affecting the language itself....

more concoqtion!?

Concoqtion is pretty interesting to me. Did you come across anything more than just the paper to download on the project?

Yep!

Follow the Concoqtion link above, then follow the link in the first Update. :-)