Extending HM type inference -- would this be possible? Or even desirable?

I wonder whether the following extension to the type inference mechanisms of languages like Haskell, ML, etc. would be possible, and, if so, desirable.

Today we write functions like

head (a:_) = a

and get warnings about patterns not covering all possibilities. This is fine. However, I get no warning when the compiler sees code like

head []

(even if head is defined as above, without a default clause that raises a runtime error).

In principle, though, type inference could track, for each value of an algebraic datatype, the set of constructors by which that value could possibly have been created.

For example, the type of
unJust (Just x) = x
would not be (Maybe a -> a), but something like
(Maybe{Just} a) -> a
Consequently, the type of Nothing would be (Maybe{Nothing} a), and the type checker could flag occurrences of
unJust x where (x::Maybe{Nothing} a)
or
head y where (y::[a]{[]})
as type errors.

The consequence would be that functions with incomplete patterns would no longer be worth a warning at the definition site. Rather, at each place where such a function is used, the compiler would be able to diagnose one of three outcomes:
- this will fail (when the intersection of the constructor sets is empty)
- it may succeed (when the intersection is not empty)
- it will succeed as far as pattern matching is concerned (when the constructor set of the type of the argument is a subset of the constructor set of the function argument type)

Case 1 could be treated like just another type error. Case 2 could provoke a warning. Case 3 is ok, of course.
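For illustration, the constructor-set idea can be approximated today with GHC's GADTs by indexing the list type with the constructor that built it; the names below (Shape, MayBeNil, KnownCons) are invented for this sketch, and this is a coarse approximation rather than the inference extension proposed above:

```haskell
{-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

-- The type index records which constructor created the value.
data Shape = MayBeNil | KnownCons

data List (s :: Shape) a where
  Nil  :: List 'MayBeNil a
  Cons :: a -> List s a -> List 'KnownCons a

-- hd only accepts values whose constructor set is {Cons},
-- so the missing Nil clause provokes no warning: that case
-- is unrepresentable at this type.
hd :: List 'KnownCons a -> a
hd (Cons x _) = x

main :: IO ()
main = print (hd (Cons (1 :: Int) Nil))
  -- print (hd Nil)  -- rejected at compile time ("case 1" above)
```

The difference from the proposal is that here the programmer must write the index explicitly, whereas the proposal asks the compiler to infer the constructor sets.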

To be sure, this would also encourage a different style. Instead of writing dozens of meaningless default cases like
foo _ = error "can't happen"
one could leave them out and let the compiler prove that it indeed can't happen.

(I hope I have expressed my thoughts clearly enough, English not being my native language.)


You could be writing ...

You could be writing the abstract to Conor McBride's introduction to Epigram.

Totality Checker for Haskell

Catch: Case Totality Checker for Haskell by Mitchell and Runciman. Intro from the linked webpage:

A Haskell program may fail at runtime with a pattern-match error if the program has any incomplete (non-exhaustive) patterns in definitions or case alternatives. This paper describes a static checker that allows non-exhaustive patterns to exist, yet ensures that a pattern-match error does not occur. It describes a constraint language that can be used to reason about pattern matches, along with mechanisms to propagate these constraints between program components. The checker can handle some higher-order functions.
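As a concrete illustration (invented here, not taken from the paper) of what such a checker must reason about: a partial definition can be safe when every reachable call site is guaranteed to supply a matching constructor, and proving that requires propagating constraints across function boundaries:

```haskell
-- Incomplete: no clause for Nothing.
fromJust' :: Maybe a -> a
fromJust' (Just x) = x

-- Safe nevertheless: go always returns a Just, so the partial
-- fromJust' can never hit its missing case at runtime.  A checker
-- in the style of Catch propagates that fact to the call site.
firstPositive :: [Int] -> Int
firstPositive xs = fromJust' (go xs)
  where
    go (y:ys) | y > 0     = Just y
              | otherwise = go ys
    go []                 = Just 0   -- default keeps the result a Just

main :: IO ()
main = print (firstPositive [-3, 0, 7, 2])
```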

Refinement

The refinement type systems of Freeman and Pfenning ("Refinement Types for ML") capture exactly such information about constructors in an HM context.

Looks like this will be a promising weekend

Thank you, Peter, Kalani and Matthieu for the pointers to the papers. Looks like I'll spend some hours this weekend reading this interesting stuff.

Polymorphic Variants

You might also check out OCaml's polymorphic variants. These support several of the features you describe:

let head = function `Cons (x, _) -> x
(* the type of head is [< `Cons of 'a * 'b ] -> 'a,
 * meaning it accepts any variant with at most (<) the single
 * constructor `Cons *)

let maybe_head = function
  | `Empty -> ...
  | `Cons _ as non_empty -> ... (head non_empty)

(* The [as] binding here does the refinement required for the
 * call to head to typecheck *)

let _ = head `Empty (* Won't compile *)

These can be extremely useful for defining very precise types so that you simply can't construct an invalid value (in your case, so that head cannot even be applied to a possibly-empty list). However, the drawback is that the inference error messages can be terrifyingly long, even for a simple mistake like a spelling error (type annotations help a lot to reduce this, though).

Statically assuring the absence of head [] errors

Ingo Wechsung wrote

However, I am not being warned when the compiler sees code like
head []

One can statically ensure the absence of errors like taking the head or tail of an empty list -- using existing ML or Haskell98 compilers. The only extension to HM inference required is abstract data types (often in the form of a module system, however primitive) -- which is universally implemented already. This approach is indeed so simple and so old that one wonders why it is not used more widely.

http://www.haskell.org/haskellwiki/Non-empty_list

Moreover, using the existing OCaml or Haskell system, one can even
statically ensure the absence of errors like indexing past bounds of a
dynamically allocated array (aka buffer overflow errors). The
lightweight dependent typing page talks more about what is already
possible:

http://pobox.com/~oleg/ftp/Computation/lightweight-dependent-typing.html
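The non-empty-list approach from the wiki page can be sketched as follows; the names (NonEmpty, neHead, fromList) are illustrative, and in practice the constructor would be hidden behind a module boundary so that emptiness is ruled out by construction:

```haskell
-- Kept abstract in a real module: only smart constructors that
-- guarantee at least one element would be exported.
data NonEmpty a = NonEmpty a [a]

-- Total: no empty case exists for this type, so no pattern can fail.
neHead :: NonEmpty a -> a
neHead (NonEmpty x _) = x

-- The only way in from an ordinary list forces the caller to
-- handle emptiness exactly once, at conversion time.
fromList :: [a] -> Maybe (NonEmpty a)
fromList []     = Nothing
fromList (x:xs) = Just (NonEmpty x xs)

main :: IO ()
main = case fromList [10, 20, 30] of
  Just ne -> print (neHead ne)   -- safe: emptiness already ruled out
  Nothing -> putStrLn "empty input"
```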

Statically assuring the absence of head [] errors

That's exactly what I wanted to avoid:


> This approach is indeed so simple and so old one wonders why it is not used more widely.
> http://www.haskell.org/haskellwiki/Non-empty_list

Well, the diagnosis is correct when the author says that head errors occur because head is not total but we don't care. To which I add: the compiler (more concretely, the type checker) does not care either, although it could.

I don't doubt that there are techniques for writing programs that will not go wrong in Haskell, likewise in Java, even in assembly language. My concern is how easy it is to write and compile (without warnings) code that *will* go wrong. And I think it's just too easy to write something like this:

foo = let Just a = Nothing in a+1

The other issue is that incomplete-pattern warnings become much less useful when such subset information is not taken into account by the compiler. For example, I have a function f that I know returns a cons, but the compiler nevertheless warns on every

let hd:tl = f ...

There's just no way to distinguish false warnings from useful ones.
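The situation described can be sketched as follows, with a hypothetical f standing in for the poster's function; its type ([Int]) cannot express that the result is always a cons, so a compiler that warns on incomplete pattern bindings may flag the match even though it can never fail:

```haskell
-- Always returns a cons, but the type cannot say so.
f :: Int -> [Int]
f n = [n, n + 1]

g :: Int -> Int
g n = let hd : _ = f n   -- may be flagged as non-exhaustive,
      in hd              -- although it can never actually fail

main :: IO ()
main = print (g 5)
```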

(The lightweight dependent typing stuff is a different matter, AFAICS.)

Instead of enriching the expressiveness of types ...

GADTs and dependent types are seductive since they make subtle invariants available for all values of a given type. Yet, in such systems, type checking and type inference are more complicated than with usual HM type inference. An alternative is to keep the ML type syntax and add logic assertions to the source code in a Hoare-logic style. For instance, have a look at Greg Morrisett's HTT or our paper:

Extending Static Checking of Call By Value Functional Programs

In this system, you can simply write:

let head l where l <> Nil =
  match l with x :: _ -> x

and a first order theorem prover will show that the branch where l = Nil is impossible according to the precondition.

You can even enrich the specification by adding a postcondition:

let head l where l <> Nil returns x where exists xs. l = x :: xs =
  match l with x :: _ -> x

The type of "l" is 'a list, there is no type index to inform the type-checker of the length of the list. I think this is a more modular
approach because sometimes your invariant does not rely on the length but on the elements of the list.

For instance, if your list is sorted, you can write:

let min l where sorted l returns x where forall y. elements (y, l) -> x <= y =
  match l with x :: _ -> x

Moreover, when a property cannot be proven automatically, a proof obligation can be handled manually using a proof assistant like Coq.

My opinion is that we should use types to automatically prove safety properties (and ML has shown itself to be well-adapted to that purpose) and we should use standard Hoare logic to prove correctness properties using external provers.