GADT vs. Type Classes

I'm looking for a paper or tutorial which compares/contrasts generalized algebraic data types and type classes (as found in GHC/Haskell). I get the feeling that they are some how duals of each other, and that there may be some overlap of the capabilities of each, but I don't know if I understand the essential differences which makes both desirable. For example, the standard GADT example of a tagless interpreter looks pretty similar to an equivalent built using type classes. In fact, the code for "eval" is identical, only the type annotations differ. Any thoughts, comments?
data Term a where
    Lit  :: Int -> Term Int
    Inc  :: Term Int -> Term Int
    IsZ  :: Term Int -> Term Bool
    If   :: Term Bool -> Term a -> Term a -> Term a
    Pair :: Term a -> Term b -> Term (a,b)
    Fst  :: Term (a,b) -> Term a
    Snd  :: Term (a,b) -> Term b 
  
eval :: Term a -> a
eval (Lit i)    = i
eval (Inc t)    = eval t + 1
eval (IsZ t)    = eval t == 0
eval (If b t e) = if eval b then eval t else eval e
eval (Pair a b) = (eval a, eval b)
eval (Fst t)    = fst (eval t)
eval (Snd t)    = snd (eval t)
data Inc a    = Inc a    
data Lit a    = Lit a   
data IsZ a    = IsZ a   
data Pair a b = Pair a b
data If p t e = If p t e
data Fst a    = Fst a   
data Snd a    = Snd a  

class    Eval a b | a -> b                      where eval :: a -> b
instance Eval (Lit a) a                         where eval (Lit i)    = i
instance (Num b, Eval a b) => Eval (Inc a) b    where eval (Inc t)    = eval t + 1 
instance (Num b, Eval a b) => Eval (IsZ a) Bool where eval (IsZ t)    = eval t == 0
instance (Eval a Bool, Eval b d, Eval c d) => 
          Eval (If a b c) d                     where eval (If p t e) = if eval p 
                                                                         then eval t
                                                                         else eval e
instance (Eval a c, Eval b d) =>    
          Eval (Pair a b) (c,d)                 where eval (Pair a b) = (eval a, eval b)
instance (Eval a (b,c)) => Eval (Fst a) b       where eval (Fst t)    = fst (eval t)
instance (Eval a (b,c)) => Eval (Snd a) c       where eval (Snd t)    = snd (eval t)
...and here's a driver program which will work for either snippet...
{-# OPTIONS -fglasgow-exts #-}  
{-# OPTIONS -fallow-undecidable-instances #-}  

main = print $ (eval stuff)

stuff = Pair (If (IsZ (Inc (Inc (Lit 5))))
                 (Lit 6)
                 (Lit 7))
             (Fst (Snd (Pair (Lit 42)
                             (Pair (Lit 8) (Lit 9)))))

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

GADTization

And if there are any GADT experts out there who are looking for a challenge, you might try to use GADTs in the following data structure to eliminate the existential types (which make writing functions like "tail" problematic). The "Fancy" data structure is statically constrained to contain linearly increasing runs of elements of alternating types. So for example, you can start out with one Int and then two Strings, then three Ints, four Strings, etc.
 data Fancy a b left next =
     forall c d left' next' lz decleft .
     (Show c, Show d,
      Zerop left lz,
      If lz (Succ next) next next',
      Pre left decleft,
      If lz next decleft left',
      If lz b a c,
      If lz a b d
     ) => Cons a (Fancy c d left' next') | Nil

 fancy :: Fancy Integer String Zero (Succ Zero)
 fancy = (Cons 1
         (Cons "a" (Cons "b"
         (Cons  2  (Cons  3  (Cons 4
         (Cons "c" (Cons "d" (Cons "e" (Cons "f"
         (Cons  5  (Cons  6  (Cons  7  (Cons  8  (Cons 9 Nil)))))))))))))))

 instance (Show a, Show b) => Show (Fancy a b c d) where
     show (Cons x xs) = "(Cons " ++ (show x) ++ " " ++ (show xs) ++ ")"
     show Nil = "Nil"

 data Succ a = S a deriving Show
 data Zero = Z deriving Show

 data TTrue
 data TFalse

 class Pre a b | a -> b
 instance Pre (Succ a) a
 instance Pre Zero Zero

 class If p t e result | p t e -> result
 instance If TTrue  a b a
 instance If TFalse a b b

 class Zerop a b | a -> b
 instance Zerop Zero TTrue
 instance Zerop (Succ a) TFalse

Detailedness of the types

I am not competent enough to really characterize the difference, but I guess one important difference in your example is that the type class solution requires "deep" types, i.e., the structure of the terms is fully reified at the type level. This is not the case in the GADT solution.

If you consider, say, a parsing function, then I would imagine that you can only give a reasonable implementation using the GADT approach since the structure (and hence type) of the expressions cannot be predicted by the type system in the type class approach.

Parser

Just to follow up, here's a thread with a couple of different parsers for the GADT example over on the Haskell-Cafe mailing list.

GADTs are closed?

I was just rewriting a function that used GADTs to use type classes and one obvious difference is that GADTs are closed, while type classes allow you to extend them as you want.

Can't tell you more, though..

Encoding type classes into GADTs

This answer is really a bit late, but I just stumbled across your post by chance. My paper "polymorphic typed defunctionalization and concretization" (with Nadji Gauthier) shows a type-preserving encoding of type classes into GADTs. This proves, in a way, that "anything that can be done with type classes can be done with GADTs". However, for the encoding to be used in practice, you need either a closed world assumption or a form of extensible GADTs.

A Java version

This is even later, but I thought people might find it interesting to see how this example looks in Java. We can translate the GADT into a set of classes implementing a parameterized interfaces in a quite direct and type-safe way:

public interface Term<T> {
    T eval();
}

public class Lit implements Term<Integer> {
    int i;
    public Lit(int i_) { i = i_; }
    public Integer eval() { return i; }
}

public class Inc implements Term<Integer> {
    Term<Integer> t;
    Inc(Term<Integer> t_) { t = t_; }
    public Integer eval() { return t.eval() + 1; }
}

public class If<T> implements Term<T> {
    Term<Boolean> b;
    Term<T> e1, e2;
    If(Term<Boolean> b_, Term<T> e1_, Term<T> e2_) { b = b_; e1 = e1_; e2 = e2_; }
    public T eval() {
        if (b.eval()) return e1.eval(); else return e2.eval();
    }
}

etc.

The combination of subtyping and type parameterization is actually pretty powerful. Of course, the data abstraction enforced by Java leads to some verbosity relative to Haskell.

Red Herring: only works for some select examples

This problem has been described in detail (in the case of List, not Coproduct), in the 2006 work of Andrew Kennedy and Claudio Russo, Generalized Algebraic Data Types and Object-Oriented Programming. They point out something that I found really surprising and interesting.

There is a very clear correspondence between notions of GADTs in a functional setting (algebraic types whose constructors introduce existential types and use type equality constraints) and a counter-part in an object-oriented setting (classes whose methods universally quantify over types, and use type equality constraints), but those types and classes are not at all mapped one-to-one by the usual "sums into distinct classes, functions into methods" correspondence between functional and object-object languages.

That is, you can take a classic functional GADT example (well-typed expressions with an Eval function) and its object-oriented transcription will require no type equality constraints. Conversely, a completely trivial example (List.flatten, or Coproduct.flatten) will suddenly be impossible to express in an OO language without type equality (or subtyping) constraints.

This situation has resulted in people claiming that, for example, "Scala has GADTs" on the incorrect basis of managing to translate one paradigmatic functional GADT example without needing type equalities. If you look at the details, it turns out that Scala can in fact express GADTs through rather ugly workarounds (you can quantify on (forall x such that foo <= x <= bar), which adds a foo <= bar constraint, and therefore by abstracting over two dummy variables add an equality constraints between existing types), or abstract over an implicit runtime equality/coercion witness (which is what the Scala standard library does for List.flatten). You're out of luck with Java.

(Finally, in OO languages that are subtyping-heavy it is more natural to add subtyping constraints rather than type equality constraints. There is a further paper by Burak Emir, Andrew Kennedy, Claudio Russo and Dachuan Yu, "Variance and Generalized Constraints for C# Generics" that details this construction.)

The typed evaluator example does not need GADT

I think the confusion about the existence of GADTs in various languages comes because of a deeply unfortunate example of a typed interpreter. Almost any paper on GADT (except ours at the ML 2010 workshop) uses this flawed example. But it does not require GADT at all, if we use the tagless-final encoding, explained in detail in
http://okmij.org/ftp/tagless-final/course/course.html#lecture

The posted Java version seems to implement exactly that. In fact, the Java example is essentially the same as
http://okmij.org/ftp/tagless-final/course/final_obj.ml
that uses OCaml objects.

The ML2010 presentation about encoding of GADT with first-class modules deliberately used a different, and more practically useful example
http://okmij.org/ftp/ML/first-class-modules/index.html

That presentation also points out the litmus test for GADT (which first-class modules fail, btw). One can say that their language supports GADTs if they implement the Equality GADT and it supports injectivity. The injectivity, that is, deriving EQ a b from EQ [a] [b] is very difficult to fake.

"Term/type level type" is

"Term/type level type" is essential I think. If you have some dependencies on the term-level represented by some function you may lift this dependencies in the type level and thus check some properties at compile time and vice versa if you don't need some dependencies represented by type classes to be checked at compile time you may consider GADT's.

Expression problem

Just look into this. It is solution of expression problem with type classes.

This only works when you

This only works when you know the value at compile time, which in practice is never...

Misplaced comment?

I'm not sure what you're commenting on (I don't see the link with the post above, but I won't watch the video anyway, so I'll just assume that it's similar to "Data Types à la Carte"), but I'm tempted to disagree in any case.

In Data Types à la Carte a

In Data Types à la Carte a value has type Foo (A :+: B :+: C) where A, B, C are the constructors that are allowed to appear in the value. In the presentation above, a value has a type that mirrors its structure. So Add (Const 1) (Add (Const 2) (Const 3)) has type Add Const (Add Const Const). You can introduce an existential but then you have to decide at that point which operations you want to support, so then in my opinion that does not solve the expression problem because it lacks extensibility of operations. Perhaps I missed a key point because I went through the video quickly, in any case in the slides there is no solution to this problem.

Actually it was a homework

Actually it was a homework =) to make it usable for the real life i.e. at runtime. That is why I think it should be solvable, but I haven't started implementing it.