## Types Considered Harmful

Benjamin C. Pierce's presentation slides (in PDF) for his talk on Types Considered Harmful. The talk starts out discussing some of the general advantages and disadvantages of static typing, but its focus is the problems of building a type checker for the Boomerang programming language (an offshoot of Harmony).

• Boomerang language design as an example of
1. the need for very precise types
2. some of the technical problems they raise
• Contracts as an attractive way of addressing some of these issues

Pierce's current work centers on creating a PL for bidirectional programming. It's a work in progress, but it's interesting to see the thought process behind language design in real time.

### I think the core message of

I think the core message of the talk might be best summarized as: "hard things are pretty darned hard when it comes to type systems; contract systems help make things more reasonable for everyone."

### I might phrase it as,

I might phrase it as, "contracts are easier to understand than types, and have an appropriate mapping to complicated, unreadable types (or runtime tests) via these combinators".

### That's not really viewing it

That's not really viewing it reasonably. Types are hard in the same sense that proofs are hard if you're doing everything rigorously in a fixed logic.

Contracts are simply a mechanism for lessening the burden of proof in exchange for slightly weaker static guarantees.

### Hmm, that sounds almost like

Hmm, that sounds almost like what I said. :-)

### what is the difference between contracts and types?

I always thought of (statically checked) types roughly as anything the compiler/interpreter automatically checks before compiling/running a program.

### Butchering the spirit of

Butchering the spirit of TAPL, and using a presumed focus on execution semantics, a type system is something that provides progress and preservation guarantees for any derivation (program trace) of a well-typed term.

I'm not sure of a concrete definition of contracts. I view them as invariants with a notion of blame for when the invariant is not satisfied. The origin is likely Hoare logic (precondition, statement, postcondition). In practice, contracts are written in the language of the program--such as prologue/epilogue assertions for functions--but that's not necessary. For example, simple contracts may be checked in a type system (and still statically assign blame!)--see some of Phil Wadler's POPL 2008 work. While I haven't seen it, there might be merit in having a contract system more expressive than the actual program language--non-safety properties, like liveness or some non-interference ones, could not be expressed without doing so.

Putting these together: type systems for languages are *mechanisms* for statically verifying particular properties of programs, while contracts are *declarations* of invariants whose violation can be attributed (to a term? a computation tree?). Some contracts may be statically checkable with type systems (well, with an augmented system: one extended with a means to assign blame in cases of failure)--saying anything more may be a matter of religion :)
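As a rough illustration of "invariants with a notion of blame", here is a minimal sketch in Python (the `contract` decorator and its blame messages are hypothetical, not from any particular library):

```python
# Contracts as invariants with blame: a failed precondition blames the
# caller, a failed postcondition blames the function itself.
def contract(pre, post):
    def wrap(fn):
        def checked(*args):
            if not pre(*args):
                raise AssertionError(
                    f"blame caller of {fn.__name__}: precondition violated")
            result = fn(*args)
            if not post(result):
                raise AssertionError(
                    f"blame {fn.__name__}: postcondition violated")
            return result
        return checked
    return wrap

@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def isqrt(x):
    return int(x ** 0.5)

isqrt(16)    # satisfies both invariants
# isqrt(-1) raises, blaming the caller rather than isqrt itself
```

The point of the sketch is only the attribution: the same failed invariant is reported against different parties depending on which side of the boundary it sits.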

### Wadler and Findler

I think you mean Wadler and Findler's Scheme Workshop 2008 work. Or, better still, their ICFP 2008 submission, which is a revised, clearer presentation of their original work. In their system, they don't check that the contracts are "correct" or statically valid -- they check only for "compatibility" at simple types (i.e., without gross type errors, like casting <int <= string>). What I think you mean is that their "blame theorem" (proposition 10 and corollary 11) provides an intuition for when contracts (to them, "casts") produce blame: when casting (<S <= T>b e), if T <: S, then b will not be blamed. (Sort of. They cut it a little finer, splitting subtyping into positive and negative components.)

In spirit, if not in fact, the violation of a contract is attributed to a term, viz. Blume and McAllester's semantics, wherein contract violations produce (or at least imply the existence of) a counterexample. In the Wadler and Findler system, blame is attributed to a cast (assuming disjoint labels). In PLT Scheme, it's attributed to a module (last I checked -- I'm afraid my Scheme days are over, Leo).

As a slight alternative to your "putting these together": I think the key heterodoxy of contracts is its loss of static validity (compared with similarly expressive systems) in exchange for dynamic satisfiability.

Edited: to fix escaping of < and > in casts, drop unnecessary scare quotes.

### It says in the slides

The top slide on page 33 says,

• Mechanical checks of simple properties enormously improve software quality
• Types: general but weak theorems (usually checked statically)
• Contracts: general and strong theorems, checked dynamically for particular instances that occur during regular program operation
• Unit tests: specific and strong theorems, checked quasi-statically on particular "interesting instances"
• Needed: better ways of integrating these different sorts of checks
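The three kinds of checks from the slide can be put side by side in a small Python sketch (a hedged illustration; the static check assumes an external tool such as mypy reads the annotation):

```python
# Type: general but weak, checked before running by a tool such as mypy.
# Contract: general and strong, checked dynamically on instances that
# actually occur. Unit test: specific and strong, run on chosen instances.
def reciprocal(x: float) -> float:        # the type rules out non-numbers
    assert x != 0, "contract violated: divisor must be nonzero"
    return 1.0 / x

def test_reciprocal():                    # a specific "interesting instance"
    assert reciprocal(4.0) == 0.25

test_reciprocal()
```

Each check catches a different slice of bugs: the type catches `reciprocal("a")` statically, the contract catches `reciprocal(0.0)` at runtime, and the unit test pins down one concrete behavior.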

### Puzzled by title

Why would someone as smart as Benjamin Pierce be so daft as to title a talk, about weaknesses of static type systems that need not be possessed by dynamic type systems, "Types considered harmful"? Was there a reason for not changing the "Types" into "Static typing"?

This puzzle aside, it's good to see this argument pulled together all in one place, although it is far from new.

Edit: remove unwanted doubling negation.

### Because everyone knows

that "dynamic typing" is an oxymoron? :)

Seriously--methinks Pierce was trying to be a bit provocative, just to prove a point. His audience was likely a group of people that are heavily into (static) type systems, and the bit about "screw it, use Scheme" was just a joke. (I'm all for relevant humor in presentations and the literature; and seeing as how type theory is founded on General Abstract Nonsense, a little bit of mathematical humor seems appropriate).

But still--it is good to re-emphasize the point that type systems, contracts/assertions, and tests are all just different means of establishing program invariants with different strengths and weaknesses. Might as well throw in formal methods as well...

### Yep

Pierce's work here serves (me) as a sharp reminder that in a multi-stage computing world, "static" and "dynamic" just aren't enough dimensions in which to have well-formed discourse.

### A thought experiment....

...though it wouldn't be surprising to find that a system such as I am about to describe actually exists in practice. (Maybe this is an optimization of how existing bytecode verifiers could work?)

What would you think about a system that, when a user attempted to start an application (or perhaps a finer program granule than that), ran correctness analyses of the program in parallel with the running code--killing off the offending program (after it started) if something were to be found to be amiss? (In such a system, the program itself might be blocked pending successful verification were it to try something dangerous.)

Currently, program verification is done prior to the program's execution--either much earlier (i.e., by a static typechecker) or at program start time (i.e., by a Java-esque bytecode verifier). The latter has long been a source of annoyance in Java-land, as verification of a nontrivial Java program can take a bit of time.

Other than the asynchronous nature of delivery of the failure notification, how does this scenario meaningfully differ from dynamic typechecking, or other forms of runtime assertion or contract verification? One possible advantage (and an occasional flaw of contracts and assertions) is that it would focus on buggy or ill-formed programs; many times, programmers tend to take the easy way out and use assertions/contracts to handle user errors. (This can take two forms--assertions based on user input, or failure to investigate user input to ensure it satisfies the necessary preconditions for some module).

Bringing up another point--one other area in which static and dynamic analysis methods can learn from each other is error recovery. Errors can be roughly categorized as follows:

* Program errors (bugs)
  * Programs which are ill-formed (including those which attempt to violate the abstraction of the underlying machine)
  * Programs which are well-formed but do not meet functional requirements
* Runtime errors outside the programmer's control
  * Memory allocation errors
  * Network or hardware failures
* Input/user errors or temporary resource issues under the user's control
  * User device not plugged in
  * Mistyped URL in the browser's address bar, etc.

Keep in mind that an input error at a lower level of abstraction may be a program error at a higher level. If I'm writing a division routine, I view a zero divisor as an input error (which is best propagated to the client); if I write a program that contains a division by zero, then this is my fault, and dumping core may be more appropriate.

One interesting property about type systems, excluding dependent types, is that they correlate strongly with the first category--program errors. Basic type systems are sufficient to preserve runtime integrity; more powerful type systems permit domain requirements to be encoded more easily. However, the "compile-time" nature of traditional type analyses inhibits their use for runtime error or user error detection; both depend on things not available to the compiler.

As we add things like dependent typing, and more and more program analyses occurring after deployment or during execution, we may find a need for type systems to provide failure modes other than refusing to translate the program or terminating it. If a dependent type depends on user input, and bogus input is provided by said user, it shouldn't result in the program halting with a dialog box saying "could not typecheck".

### The good news is this: the

The good news is this: the compile-time nature of static, dependent types can in fact force you to validate all your input! Why? Because the correct type for what you're going to do with it ends up being "prove it's well-formed", and if you've got a point up to which you've had more general input (that can't be reliably translated) then validation's the only way to do it.

Roll on unsafeValidateIO?

### Worse than dynamic type checking

What would you think about a system that, when a user attempted to start an application (or perhaps a finer program granule than that), ran correctness analyses of the program in parallel with the running code--killing off the offending program (after it started) if something were to be found to be amiss? (In such a system, the program itself might be blocked pending successful verification were it to try something dangerous.)

On its own, such a system would be worse than dynamic type checking, because it allows erroneous programs to run and exhibit undefined behavior. If it inserts such checks (as in your proposal), then it behaves like dynamic type checking, perhaps only failing earlier because the concurrent analyzer proved the future execution paths to be erroneous, but consuming more resources. There are also questions about paths that the analyzer proves erroneous but that aren't actually reachable (should the program terminate or not?). Overall I think such a system would only be useful if it also specialized the program at runtime; otherwise it's only slightly better than dynamic type checking, but it's non-deterministic (depending on the input, the analyzer may or may not catch a bug before execution reaches it).

As we add things like dependent typing, and more and more program analyses occurring after deployment or during execution, we may find a need for type systems to provide failure modes other than refusing to translate the program or terminating it. If a dependent type depends on user input, and bogus input is provided by said user, it shouldn't result in the program halting with a dialog box saying "could not typecheck".

A common approach is inserting casts that try to match the value against the expected (computed) type, throwing an exception if unsuccessful. The program can either handle the exception or let it dump core, but it can only fail in known places. The compiler could even report back the places where it can fail due to casting (IIRC the Sage compiler did something like this) and you could decide what to do on a case-by-case basis.
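A minimal sketch of that approach in Python (the `cast_to` helper is hypothetical; the Sage-style reporting of failure sites is not modeled):

```python
# A checked cast: the program can only fail at the cast sites, and the
# failure is an ordinary exception the program may choose to handle.
def cast_to(ty, value):
    if not isinstance(value, ty):
        raise TypeError(f"cast failed: expected {ty.__name__}, "
                        f"got {type(value).__name__}")
    return value

def double(n):
    n = cast_to(int, n)       # the only place this function can fail
    return n * 2

try:
    double("oops")
except TypeError as e:
    print("handled:", e)      # handle the failure instead of dumping core
```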

On a related note, many proponents of dynamic type checking say that it's useful to run buggy programs if the erroneous parts aren't reachable by the particular execution path. In a sense, the lack of this behavior is a failure (assuming such behavior is useful) of languages that don't let you run programs when static type checking fails, but there's nothing preventing a compiler flag that inserts an unconditional error signal at the places that don't type check and lets the programmer run the program anyway. If a programming language could be built with some form of dependent typing and this feature, it would be provably better than dynamic type checking, because in the worst case there's no loss of functionality, and the programmer can adjust the correctness slider to the point he wants. Creating such a language is left as an exercise to the reader...

### Programs with bugs that never execute

On a related note, many proponents of dynamic type checking say that it's useful to run buggy programs if the erroneous parts aren't reachable by the particular execution path. In a sense, the lack of this behavior is a failure (assuming such behavior is useful) of languages that don't let you run programs when static type checking fails, but there's nothing preventing a compiler flag that inserts an unconditional error signal at the places that don't type check and lets the programmer run the program anyway. If a programming language could be built with some form of dependent typing and this feature, it would be provably better than dynamic type checking, because in the worst case there's no loss of functionality, and the programmer can adjust the correctness slider to the point he wants. Creating such a language is left as an exercise to the reader...

An argument which is somewhat analogous to the statement that lazy evaluation is preferable to eager evaluation, as certain programs may not terminate under eager evaluation but do so under lazy evaluation.

Of course, getting back to a prior point, dynamically-typed languages do permit trapping of type errors. You generally cannot ignore them and execute ill-typed code that violates the underlying machine abstractions, but you can detect the type errors and do something else that makes sense--whether reflecting the error back to the user, adding a more useful diagnostic for an upstream error handler, or committing hara-kiri.

One fly in the ointment is, of course, type erasure. Routines that know exactly the types of their argument(s) can elide any typechecks on the arguments, permitting faster code to be generated. Routines that don't, on the other hand, either have to include typechecks or be unsafe. Are there any languages out there which permit both "quick" and "safe" versions of the same routine to co-exist, selecting the appropriate one as necessary?
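One low-tech way to get both versions, sketched in Python (the function names and check strategy are illustrative, not from any particular language):

```python
# A "safe" entry point checks its argument once, then dispatches to the
# "quick" unchecked version; callers that already know the types can call
# the quick version directly, skipping the per-call checks.
def sum_quick(xs):
    # assumes every element is an int; no per-element checks
    total = 0
    for x in xs:
        total += x
    return total

def sum_safe(xs):
    if not all(isinstance(x, int) for x in xs):
        raise TypeError("sum_safe: expected a list of ints")
    return sum_quick(xs)      # checks done, take the fast path
```

Compilers that specialize on static type information effectively generate the quick version automatically; the question in the comment is about exposing that choice at the language level.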

### Are there any languages out

Are there any languages out there which permit both "quick" and "safe" versions of the same routine to co-exist, selecting the appropriate one as necessary?

This is a fairly traditional application for specialisation and case/constructor style optimisations - language-level support's less common, it's generally more a matter of how much you trust your compiler to work out for itself. I know there're times I'd like to get more output about various compiler phases' "thinking" alongside the end result!

The good news there is that dependently-typed languages offer more opportunities for optimisation, not less - Edwin Brady's thesis has plenty to say there.

### Type erasure is our friend

Of course, getting back to a prior point, dynamically-typed languages do permit trapping of type errors. You generally cannot ignore them and execute ill-typed code that violates the underlying machine abstractions, but you can detect the type errors and do something else that makes sense--whether reflecting the error back to the user, adding a more useful diagnostic for an upstream error handler, or committing hara-kiri.

Once the semantics specify that casts throw exceptions, and the language offers support for exception handling, then a statically checked language with casts would be as expressive as a dynamic language.

Are there any languages out there which permit both "quick" and "safe" versions of the same routine to co-exist, selecting the appropriate one as necessary?

There's no need for two versions. In languages with contracts that blame correctly, and in gradual typing, the cast is inserted before applying the function, so the compiler casts, removes the tags, and executes the erased version. If we have a path that requires the correct types and a path that doesn't, the compiler can either unify the types or place the cast in the most specific place:

foo x y = if x <= 0 then (x, y)
                    else (x, y + x)

In this example, the type of foo can be either (a) int -> int -> (int, int) (unifying both branches), (b) int -> ? -> (int, ?) (using type dynamic (i.e. ?) and casting/injecting in the else branch), or (c) (forall a. [x:int | x <= 0] -> a -> (int, a)) /\ ([x:int | x > 0] -> int -> (int, int)) (in a language with dependent and intersection types). In cases (a) and (c) the casts are inserted when the function is applied; in case (b) the function may not erase the type of y. Perhaps case (c) can be used in the way you wanted: each branch of the intersection type makes different assumptions about the values (and shape) of the parameters, so the compiler could create two different specializations and use aggressive optimization/specialization/jitting to branch on the function. This strategy is quite sophisticated, because it can easily lead to code bloat, so the compiler should be careful when doing it, but it's not much more sophisticated (modulo the type system) than current state-of-the-art JIT compilers.
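Case (a) can be sketched in Python: the body of foo runs without consulting any tags, and the cast happens at the application site (the helper names are illustrative):

```python
# Case (a): both branches unified at int -> int -> (int, int); the body of
# foo runs erased, and the cast is inserted where foo is applied.
def foo(x, y):
    # assumes x and y are ints; no tags consulted here
    return (x, y) if x <= 0 else (x, y + x)

def apply_foo(x, y):
    # cast inserted at the call site, as in gradual typing
    if not (isinstance(x, int) and isinstance(y, int)):
        raise TypeError("cast to int -> int -> (int, int) failed at call site")
    return foo(x, y)
```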

### There's still the issue of type inference and blame

Java and C++ today both contain trappable downcasts, and have for over a decade. But you gotta annotate dang near everything in those, and getting back to Pierce's talk, when you add type inference to the mix, who do you blame when a type unification fails, and there's considerable distance between the constructor of the datatype in question (where its type is set), and the consumer which doesn't like it?

I guess the sweet spot might be a language with:

* Type inference by default
* Contract annotations (including explicit type annotations, or annotations of arbitrary predicates, including things more powerful than the type system can prove; obviously the latter will need to be checked at runtime. Static-typing fanatics shouldn't complain; your favorite languages have had guard clauses for years now...)
* A sound system for handling failures at any level (including exception-handling, though for some failures other more local means of dealing with failure are more appropriate)
* Pre-deployment diagnostics for when a program can never work (nowadays called "compiler error messages", though as indicated above it might be useful to downgrade these to really strong warnings if the compiler is able to emit a well-formed but likely-semantically-invalid-in-part program)
* Oh, and some way of specifying unit tests in the same place as the rest of the code.

### you add type inference to

you add type inference to the mix, who do you blame when a type unification fails

That's a good question. It can indeed get tricky assuming a chain of polymorphic functions in between. My naive answer: you have to state not only the end of chain where the unification failed, but also the function that initiated the polymorphic call. In other words, the polymorphic functions are clearly not to blame, so show the monomorphic functions on each end. Something like, "x of type T1 is provided by f, but x of type T2 required by g".

Is there something wrong with my naive approach?
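The naive approach amounts to a unifier that carries provenance for each constraint, roughly like this toy sketch (names and message format are illustrative):

```python
# Each constraint carries the monomorphic end it came from; on failure,
# both ends of the chain are reported, not the polymorphic middle.
def unify(t1, src1, t2, src2):
    if t1 != t2:
        raise TypeError(f"x of type {t1} is provided by {src1}, "
                        f"but x of type {t2} is required by {src2}")
    return t1

unify("int", "f", "int", "g")    # succeeds
# unify("int", "f", "string", "g") would report both f and g
```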

### Not parametric polymorphism

That's a good question. It can indeed get tricky assuming a chain of polymorphic functions in between. My naive answer: you have to state not only the end of chain where the unification failed, but also the function that initiated the polymorphic call. In other words, the polymorphic functions are clearly not to blame, so show the monomorphic functions on each end. Something like, "x of type T1 is provided by f, but x of type T2 required by g". Is there something wrong with my naive approach?

With complex type systems there may be intermediate steps that add erroneous (but valid in context) inferences. If the system propagates contracts, one parameter used in the wrong place may result in the unification of some additional contracts that eventually contradict an expectation later in the chain. Add some form of ad-hoc polymorphism to the mix and hilarity ensues. OTOH, if the usual programmer practice calls for adding explicit types and contracts as documentation when things get complicated, then the pathological cases are vastly improbable.

IMO it's clear that today's compilers are broken when it comes to good error messages, particularly when the types become too interesting. Perhaps we need a way to extend the compiler's handling of certain types, so that the creator of the type can craft a better error message, or something similar.

### It wants generalising a

It wants generalising a little - the full version is to find the minimal set of constraints that makes the typing problem unsolvable and throw up the locations they came from. This catches more complicated situations where there's, e.g., a three-way conflict, and in cases where one location disagrees with two others it can help suggest which type is correct.

### Democratic type systems

It wants generalising a little - the full version is to find the minimal set of constraints that makes the typing problem unsolvable and throw up the locations they came from. This catches more complicated situations where there's, e.g., a three-way conflict, and in cases where one location disagrees with two others it can help suggest which type is correct.

Now we can have democratic type systems, if three locations say it's type A the fourth location should accept and add a cast ;)

Minority Cast?

### Pierce's view of typing

The likely rationale is Benjamin Pierce's view of typing.

From the preface to "Types and programming languages":

The word "static" is sometimes added explicitly--we speak of a "statically typed programming language," for example--to distinguish the sorts of compile-time analyses we are considering here from the dynamic or latent typing found in languages such as Scheme (Sussman and Steele, 1975; Kelsey, Clinger, and Rees, 1998; Dybvig, 1996), where run-time type tags are used to distinguish different kinds of structures in the heap. Terms like "dynamically typed" are arguably misnomers and should probably be replaced by "dynamically checked," but the usage is standard.

In other words (and obvious from context in the preview), he views true types as being statically checked.

Eivind.

### A more nuanced take on the distinction...

...arises from the observation that dynamically typed systems, many of 'em anyway, don't actually determine the "type" of a term. Instead, they simply determine whether or not a given operation on a given object is permissible; instead of asking if something is a duck, they ask "can it quack?"

I put "type" in quotes, because it is important to remember that the notion of type is a function of the type system in question, rather than a language-independent property. In languages with subsumption (i.e. OO languages), an object may belong to more than one type (though in general, there will be a least class to which the object belongs). In some languages, the notion of "type" is an artifact of construction. In languages with predicate type systems, there are subtypes of classes to consider. In prototype languages, it gets really complicated.