How OCaml type checker works -- or what polymorphism and garbage collection have in common

There is more to Hindley-Milner type inference than Algorithm W. In 1988, Didier Rémy, looking to speed up type inference in Caml, discovered an elegant method of type generalization. Not only is it fast, avoiding the scan of the type environment; it also smoothly extends to catching locally declared types about to escape, to type-checking universals and existentials, and to implementing MLF.
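To make the level idea concrete, here is a minimal sketch on a toy type language, in the spirit of the algorithm described above. The names (`typ`, `Unbound`, `generic_level`, the helpers) are illustrative, not the actual OCaml compiler internals: each `let` right-hand side is checked at a deeper level, unification lowers the levels of the variables it touches, and generalization then quantifies exactly the variables whose level is deeper than the current one, with no scan of the environment.

```ocaml
(* Toy types: mutable variables carry a binding level. *)
type typ =
  | TVar of tv ref
  | TArrow of typ * typ
and tv =
  | Unbound of string * int       (* variable name and its level *)
  | Link of typ                   (* variable instantiated by unification *)

let generic_level = max_int       (* marks variables quantified by generalize *)
let current_level = ref 1         (* bumped when entering a let right-hand side *)

let fresh_var =
  let counter = ref 0 in
  fun () ->
    incr counter;
    TVar (ref (Unbound ("'a" ^ string_of_int !counter, !current_level)))

(* Unifying a variable with a type lowers the levels of the type's
   variables to the variable's own level: this is the dependency
   tracking that makes generalization a mere level comparison. *)
let rec update_level lvl = function
  | TVar ({contents = Unbound (n, l)} as r) ->
      if l > lvl then r := Unbound (n, lvl)
  | TVar {contents = Link t} -> update_level lvl t
  | TArrow (a, b) -> update_level lvl a; update_level lvl b

let rec unify t1 t2 =
  match (t1, t2) with
  | TVar {contents = Link t1}, t2 | t1, TVar {contents = Link t2} ->
      unify t1 t2
  | TVar ({contents = Unbound (_, lvl)} as r), t
  | t, TVar ({contents = Unbound (_, lvl)} as r) ->
      if t1 != t2 then begin
        update_level lvl t;       (* no occurs check in this sketch *)
        r := Link t
      end
  | TArrow (a1, b1), TArrow (a2, b2) ->
      unify a1 a2; unify b1 b2

(* Generalize without scanning the environment: a variable is
   quantifiable exactly when its level is deeper than the current one. *)
let rec generalize = function
  | TVar ({contents = Unbound (n, l)} as r) when l > !current_level ->
      r := Unbound (n, generic_level)
  | TVar {contents = Link t} -> generalize t
  | TVar {contents = Unbound _} -> ()
  | TArrow (a, b) -> generalize a; generalize b
```

For `let id = fun x -> x in ...` one would bump `current_level`, infer `TArrow (a, a)` for a fresh `a`, drop the level back, and `generalize`: since `a` was created at the deeper level and never unified with anything shallower, its level check succeeds and it is quantified.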

Alas, both the algorithm and its implementation in the OCaml type checker are little known and little documented. This page is to explain and popularize Rémy's algorithm, and to decipher a part of the OCaml type checker. The page also aims to preserve the history of Rémy's algorithm.

The attraction of the algorithm is its insight into type generalization as dependency tracking -- the same sort of tracking used in automated memory management such as regions and generational garbage collection. Generalization can be viewed as finding dominators in the type-annotated abstract syntax tree with edges for shared types. Fluet and Morrisett's type system for regions and MetaOCaml environment classifiers use the generalization of a type variable as a criterion of region containment. Uncannily, Rémy's algorithm views the region containment as a test if a type variable is generalizable.

As usual with Oleg, there's a lot going on here. Personally, I see parallels with "lambda with letrec" and "call-by-push-value," although making the connection with the latter takes some squinting through some of Levy's work other than his CBPV thesis. Study this to understand OCaml type inference and/or MLF, or for insights into region typing, or, as the title suggests, for suggestive analogies between polymorphism and garbage collection.

A great document

I just deleted my corresponding forum entry. Below are my ramblings, which were superseded by Paul's submission.

I think there are interesting reasons why principal type inference is not as popular today as it was in past decades (for example, Rémy's further work on MLF has been mostly ignored by the research community at large, with the notable exception of Daan Leijen): the type languages we are now studying are so far past the original ML "sweet spot" between expressivity and simplicity that nobody expects principal types anymore. People are convinced from the start that type inference will be an unsolvable mess for their new language, and tend to go for local, bidirectional approaches that are simpler to develop but can have less robust properties.

There is also the issue of what happens when you move to dependent type theories. There has been interesting work on type inference in dependent settings (e.g. Andreas Abel and Brigitte Pientka in 2011, or Conor McBride and Adam Gundry in 2012), but I'm not sure how we could distinguish the different layers of the problem and keep robust principality guarantees for the ML-ish or System-F-ish portions of programs.