Dependent types: literature, implementations and limitations?

In my eternal quest for provably safe concurrent and distributed resource management, I'm trying to reach an understanding of dependent types. I have read a bit about them, even toyed a little bit with them in Twelf, but I still don't quite *get* them.

  • Really, what *are* dependent types? What do we expect to do with them? Where exactly does the domain stop?
  • Have DML, Cayenne, and Epigram been tested with pragmatic examples, beyond simple list manipulation? -- I don't include Twelf or Coq in the list, as the focus of these languages is quite different.
  • What are the limitations of dependent types? Is typechecking even decidable?
  • How much of dependent typing can be encoded with non-dependent types?
  • Is there work on dynamic counterparts of dependent types?
  • What papers/tutorials must one absolutely read on the subject?

By the way, the Wikipedia article on Dependent Types is nearly empty. Is anyone interested in completing it?

Just to make it really easy.

You can start editing right away.

Page 462 of Types and Programming Languages

See page 462 of Types and Programming Languages:

families of types indexed by terms

Any term?

Thanks for the reference.

Does this mean I should be able to index a type by, say, all terms of type

'a list n -> 'a list (n+1)

? What would this mean? How would this interact with modules? Does this mean that C++ has (unsafe) dependent types?
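(As a sketch of what such an index might look like in OCaml, using GADTs and type-level Peano numerals -- the names z, s, vec, and cons are invented for illustration:)

```ocaml
(* Type-level Peano numerals: z is zero, 'n s is the successor of 'n. *)
type z
type 'n s

(* A list whose type records its length. *)
type ('a, 'n) vec =
  | Nil  : ('a, z) vec
  | Cons : 'a * ('a, 'n) vec -> ('a, 'n s) vec

(* cons has exactly the shape asked about:
   ('a, 'n) vec -> ('a, 'n s) vec *)
let cons (x : 'a) (xs : ('a, 'n) vec) : ('a, 'n s) vec = Cons (x, xs)

(* head is only defined on non-empty vectors, so no match failure
   is possible; the compiler proves the Nil case cannot occur. *)
let head : type a n. (a, n s) vec -> a = function Cons (x, _) -> x
```

Here the "term" indexing the type is restricted to a type-level copy of the naturals rather than an arbitrary program expression, which is the essential compromise of such encodings.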

Dependent types == first-order logic

1. Start with how the Curry-Howard isomorphism matches up types in ML and Haskell with propositional logic:
product types   == conjunction
sum types       == disjunction
function types  == implication
unit type       == true
void type       == false
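(As a sketch, this table can be written out directly in OCaml; the names conj, disj, impl, and truth are just illustrative:)

```ocaml
(* conjunction: a proof of (p and q) is a pair of proofs *)
type ('p, 'q) conj = 'p * 'q

(* disjunction: a proof of (p or q) is a proof of one side or the other *)
type ('p, 'q) disj = Left of 'p | Right of 'q

(* implication: a proof of (p -> q) is a function from proofs to proofs *)
type ('p, 'q) impl = 'p -> 'q

(* true: the unit type, trivially inhabited *)
type truth = unit

(* a sample tautology: (p and q) implies (q and p), proved by a program *)
let swap : (('p, 'q) conj, ('q, 'p) conj) impl = fun (p, q) -> (q, p)
```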

In ML and Haskell, type polymorphism is (almost) second-order polymorphism: you can write universally quantified formulas, and the domain of quantification is the set of predicates.

You might ask: what about first-order polymorphism? Like, suppose I have a proposition that for all integers n, some property P(n) holds. Since n does not range over types, it can't be encoded as a type in ML or Haskell. Instead n ranges over each individual integer, and P(n) is a formula indexed by integers. That is, our quantifier is ranging over individuals, and not entire predicates.

2. Dependent types are the Curry-Howard interpretation of first-order logic. First, note that we represent individual values in a functional programming language using programming language expressions. So integers can be represented using expressions like 1, 2, 4+6 and so on.

Second, note that CH says that propositions correspond to types. Since a proposition in first-order logic can have individuals appear in it, our corresponding type must have whatever corresponds to individuals in our programming language appear in the type. Since individual values are program expressions, this means that expressions must be able to appear in types.

This is where the name "dependent types" comes from: a type expression can "depend on" a program expression, since a program expression can appear in a type.

3. The exciting thing about them is that once you have a programming language with dependent types, you can express, as a typechecking constraint, any program specification that can be written in first-order logic.

What makes typechecking dependently-typed programs difficult is the problem of type equality. Since expressions can appear in types, deciding whether two types are equal (for instance, to decide whether the argument to a function is well-typed) requires deciding whether program expressions are equal. This is quite difficult, and if the programming language permits an unrestricted general recursion operator then it becomes undecidable. If you confine nontermination and other side-effects to a monad or the like, then typechecking can remain decidable.

4. Pure first-order and second-order logic are strictly incomparable; there are things you can say in each that the other cannot express. A language whose type system is a full higher-order logic (like Coq's) can encode (roughly speaking) any constraint you can express in constructive set theory.

Food for thought

 void type       == false
Although I understand the concept, I don't see any void type in ML. Or would something like unit -> 'a be the void type? (edit: now I understand better)


Since n does not range over types, it can't be encoded as a type in ML or Haskell.
Can't it? I haven't really tried, but I have the impression that it is possible to encode unary integers and Peano arithmetic in the type system.
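(For instance, a sketch of such an encoding in OCaml using phantom types; the names z, s, and nat are invented for the example:)

```ocaml
type z        (* type-level zero *)
type 'n s     (* type-level successor *)

(* an int tagged, at the type level, with the numeral it holds *)
type 'n nat = Nat of int

let zero : z nat = Nat 0
let succ (Nat n : 'n nat) : 'n s nat = Nat (n + 1)

(* the type-level index makes "predecessor of zero" a type error,
   since z nat does not match 'n s nat *)
let pred (Nat n : 'n s nat) : 'n nat = Nat (n - 1)

let to_int (Nat n : _ nat) = n
```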


If you confine nontermination and other side-effects to a monad or the like, then typechecking can remain decidable.
Interesting. Do you have any references for this?

Void type

My understanding of the void type is that expressions of this type never yield any value (unlike void functions in C!), so its context is, in effect, unreachable code.

This type is useful in describing various control primitives of an abortive nature, like call/cc.

My bad

My understanding was that of a type without inhabitants, such as

type t;;

in OCaml.

Your understanding is right

What is typically meant, in type theory, by the void type is the uninhabited type. Some caveats: most Turing-complete languages lift all types (with _|_), so that the void type is inhabited by non-terminating expressions. (The type system, viewed as a logic, is then inconsistent; not to be confused with unsound!) Its encoding in the polymorphic lambda calculus is ∀α.α. As Andris Birkmanis mentioned, void in C is not the same void; rather, it's the (lifted) unit type (true, (), unit, or ∀α.α→α).
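(As a concrete sketch in OCaml, which since 4.07 can define an empty variant; the names void, absurd, and get_ok are made up for the example:)

```ocaml
(* an empty variant: no constructors, hence no values *)
type void = |

(* from a proof of false, anything follows ("ex falso quodlibet");
   the "." refutation case tells the compiler no pattern can match *)
let absurd : void -> 'a = function _ -> .

(* an (int, void) result can only ever be Ok, and the types prove it *)
let get_ok : (int, void) result -> int = function
  | Ok n -> n
  | Error v -> absurd v
```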

Elaborating on Derek's comments...

In ML or Haskell, an empty type doesn't have any values as inhabitants, even though there might be expressions of that type. Consider:
datatype void = Void of void

Here, there's no way to actually construct a value of type void. However, we can write a function with a return type of void:

- fun makev () = Void(makev());

val makev : unit -> void

If you try to call makev, you get an infinite loop, so you can never actually get ahold of a value of type void.

Void and void * in C

void functions in C indeed do not return any value. Are you confusing void with void *? The latter type is not a pointer to one of the non-existent values of type void, but rather a pointer to an unspecified (but non-function) type.

C's void is unit in most cases

"void" in C, when used as the return type of a function, most often corresponds to the "unit" type - a type with a single inhabitant (and, under CHI1, corresponding to "true" - meaning it's not very informative, except that it suggests termination of the function by way of yielding a trivial value). In some cases, though, it is indeed used as a "void" type - a type without inhabitants ("false" under CHI1) - longjmp comes to mind. In these cases, "void" means a guarantee of non-termination (or rather of never yielding a value, as there is no possible value of this type).

In argument position, C's "void" means (always?) a "unit" type.

"void*" IMHO has nothing to do with "void" and is just an example of disgusting overloading of keywords.
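(In ML terms, as a sketch with invented names: a C function "returning void" corresponds to one returning unit, while a genuinely non-returning operation like longjmp corresponds to a fully polymorphic return type, which plays the role of the empty type:)

```ocaml
(* "void" as unit: the function terminates and yields a trivial value *)
let log_msg (msg : string) : unit = print_endline msg

(* "void" as false: this never yields a value, so its return type
   can be any 'a at all -- like longjmp, or a call to exit *)
let abort (msg : string) : 'a = raise (Failure msg)

(* because abort's result type is polymorphic, it can appear
   anywhere, e.g. where an int is expected *)
let safe_div x y = if y = 0 then abort "division by zero" else x / y
```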


1CHI here means Curry-Howard isomorphism.

Dependent type encodings in Haskell or ML

neelk wrote:
Like, suppose I have a proposition that for all integers n, some property P(n) holds. Since n does not range over types, it can't be encoded as a type in ML or Haskell.
Of course it can be, given a suitable meaning of ``encoding''. The simplest example is an abstract data type. For example, in OCaml
module S1 = 
  Set.Make(struct type t = int
                  let compare x y = if x = y then 0 
                                    else if x > y then -1 else 1 end);;
gives me
module S1 :
  sig
    type elt = int
    type t
    val empty : t
    val is_empty : t -> bool
    val mem : elt -> t -> bool
    val add : elt -> t -> t
    val singleton : elt -> t
...
end
where the type of the set, S1.t, is abstract. The only way to obtain values of that type is to use the members of the module S1. That is, whatever invariant is enforced by the code of module S1 remains in force throughout the program. In particular, if S1 enforces the invariant that
	P(n) == S1.mem n (S1.singleton n) = true
or that
	Q(n) == forall m \in int. m /= n -> S1.mem m (S1.singleton n) = false
these invariants hold throughout, for all integers. The abstract type S1.t represents, at the level of types, the above properties of integers. The static property Q(n) is actually quite strong and cannot be efficiently represented by a run-time contract, due to the "forall m \in int" part.
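(Both properties can of course be spot-checked at run time for particular n and m, which also shows exactly why Q(n) exceeds what a contract can test:)

```ocaml
module S1 =
  Set.Make(struct type t = int
                  let compare x y = if x = y then 0
                                    else if x > y then -1 else 1 end)

(* P(n): an element is a member of its own singleton *)
let () = assert (S1.mem 3 (S1.singleton 3))

(* Q(n), checked for one particular m only -- the "forall m \in int"
   part is precisely what a run-time check cannot cover *)
let () = assert (not (S1.mem 4 (S1.singleton 3)))
```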

Of course the invariant actually enforced by an implementation of the data type may differ from the invariant we think it enforces. Thus an implementation becomes a ``trusted kernel'', which has to be verified, using formal tools if necessary. It goes without saying that verifying an implementation of the Set interface is far easier than verifying the whole program that uses Set.

The same approach -- using abstract types (existentials) to encode propositions on values -- underlies many other `light-weight' encodings of dependent types, for example http://lambda-the-ultimate.org/node/view/161

It should be pointed out that the guarantees of abstract data types come from the lack of reflection, extension, and introspection over abstract data types. OCaml, btw, is weaker in that respect, because polymorphic equality can break some of the invariants. So, reflection, extension and introspection facilities of a language can be viewed as vices rather than virtues.
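(A small sketch of the OCaml weakness just mentioned, assuming the stdlib's current balanced-tree representation of sets: two sets that are equal as sets may be built as differently shaped trees, so polymorphic structural equality can contradict the abstraction:)

```ocaml
module S = Set.Make(struct type t = int let compare = compare end)

(* the same two-element set, built in two different insertion orders,
   yielding differently shaped internal trees *)
let a = S.add 2 (S.add 1 S.empty)
let b = S.add 1 (S.add 2 S.empty)

let () = assert (S.equal a b)   (* equal according to the abstraction *)
let () = assert (a <> b)        (* but polymorphic equality disagrees *)
```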

So, reflection, extension and

So, reflection, extension and introspection facilities of a language can be viewed as vices rather than virtues.

Indeed. I call them "abstraction breaking facilities". The point being that programming involves both the creation of abstractions and abstraction breaking. The trick is to find semantically rich constructs for abstraction breaking. Reflection is too powerful and unconstrained for this purpose.

Literature

I'd say the characteristic feature of a dependent type system is that it allows a type to be predicated on a value. DML, Cayenne and Epigram do this in rather different ways. Epigram and Cayenne are "full-spectrum" in that the expression language for types and values is the same.

It's always hard to suggest good introductory papers, though. I learned from the introduction to Hongwei Xi's thesis, Lennart Augustsson's Cayenne papers, and playing around with various implementations (including Coq). Now, you could look at some of the Epigram papers, in particular the Epigram tutorial and Why Dependent Types Matter, and play with the prototype implementation.

As for "pragmatic examples", there are a few programs around which go beyond list manipulation; e.g. the typechecker in "The View From The Left". There are a few others in the Epigram distribution. But more examples of real programs are certainly needed...

(I admit to not really knowing what a "real" program is however :)

There is basically no problem

There is basically no problem in implementing dynamic dependent typing, which is why there is not much to write on this topic. The difficulty with dependent types is guaranteeing them statically; if you drop that constraint, the problems disappear. Every contract system I have seen allows things that would need dependent typing to do statically.

But you could also take a look at this: Dynamic Typing with Dependent Types