Comparing Approaches to Generic Programming in Haskell

Comparing Approaches to Generic Programming in Haskell by Ralf Hinze, Johan Jeuring, and Andres Löh. 2006.
You just started implementing your third web shop in Haskell, and you realize that a lot of the code you have to write is similar to the code for the previous web shops. Only the data types have changed. Unfortunately, this implies that all reporting, editing, storing and loading in the database functionality, and probably a lot more, has to be changed. You’ve heard about generic programming, a technique which can be used to automatically generate programs depending on types. But searching on the web gives you almost ten approaches to solve your problem: DrIFT, PolyP, Generic Haskell, Derivable Type Classes, Template Haskell, Scrap Your Boilerplate, Generics for the Masses, Strafunski, etc. How do you choose? And these are only the approaches to generic programming in Haskell. If you are also flexible in the programming language you use, there is a much larger variety of different approaches to generic programming to choose from.
[on edit: updated link to point to a more complete version of the paper]

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Yikes

I can't help but feel a bit disappointed that it takes 75 pages to compare these approaches. At first glance, it just reinforces the sense that there's an embarrassment of riches here... Sadly, I'm not sure if I'll be able to find time to read it, either.

Read the conclusions

I would then suggest reading just section 5.2 (and optionally 5.1) - it really reads like a single case statement :)

Question

Isn't Haskell already generic, in the sense that a function without type specifications can be used for any data type that implements the operations used inside the function?

Generic generally means

Generic generally means diffrent things in functional and object oriented contexts. In an more objective view a language is generic if it has both paramethic polymorphism and polytypism*. Object oriented have polytypism allready so adding paramethic polymorhism makes them generic. With Haskell its the otherway around. Haskell has paramethic polymorphism, but not polytypism (not counting typeclasses?).

*Polytypism being the ability to define diffrent implementations of a function for diffrent argument types. In objectoriented language you can do this for example by using inheritance and overriding methods.

Ad hoc polytypism

Inheritance gives you ad hoc polytypism at best, meaning that you have to provide implementation of a polytypic function (e.g., equality) for every new class.
Polytypic programming as discussed in the paper is not ad hoc - you define how to compare values of any algebraic type once and for all (though you may have an option to override this for some specific types).

"Genericity...

...always refers to the kind of polymorphism your favourite language does not have yet." There was a quote along these lines, but I forgot by whom, and Google didn't help.

Seriously, OO indeed does not give you anything close to polytypism (which, roughly, is what the FP and type community refers to when they speak of genericity). In fact, polytypism only applies to languages with rich type algebras, since it defines operations or types by induction on the structure of types (thus "type-indexed" functions/types). And rich type algebras are pretty much non-existent in most OOPLs with their preference for class-based nominal type systems.

Sure

Well, the cited paper agrees with your quote, they say:
Usually, the adjective generic is used to describe that a concept allows abstractions over a larger class of entities than was previously possible.
That's why I think it's very unfortunate the adjective "polytypic" is often replaced by "generic" (as in Generic Haskell).

But...

Still, OO programmers can easily get an idea of the kind of things people want to do with polytypism... In Java, for instance, these things are often done with reflection.

For instance, take a look at the reflective toString() implementation from Apache Commons. Reflection in this case allows us to write a single method implementation that works for classes of any shape, and then allows us to configure it, specialize it or override it as needed. Of course, we still have to do a bit of work for each class and this comes with a (sometimes severe) performance penalty, but lots of boilerplate has most manifestly been scrapped, and it's pretty clear how this approach is "generic" in a way that goes well beyond standard OO techniques.

Doesn't reflection imply...

...that the solution bypasses static type checking in favor of dynamic dispatch?

[Edit Note: Put another way, Java contains an ad-hoc, informally-specified bug-ridden slow implementation of half of Smalltalk. Ok, so half would be an exaggeration...]

Not if the type system knows about it

Metaphor integrates its staging constructs with reflection and existential typing.

Yes and no...

Doesn't reflection imply that the solution bypasses static type checking in favor of dynamic dispatch?

Certainly this is in general true of Java, but in this particular example, it's hard to see how the reflection could go wrong. In generating a toString() or equals() method, we're being guided by the information available via the reflection facilities, whatever it might be, rather than fishing for something specific.

Contrast this to other (more common) uses of reflection, like explicit class and method lookups. Consider for example something as simple as Class.forName(...), which can of course quite easily fail.

So in this case it doesn't seem to me like we're sacrificing that much safety by getting this information via dynamic reflection rather than statically. Performance may be another story. And anyway I'm only the slightest bit familiar with the Haskell work, so I hesitate to make any real claims.

(And yes, all in all the Smalltalk comment is probably only a the tiniest bit unfair...)

Polytypic is difficult without types

In Java, for instance, these things are often done with reflection.

I've done this in Java myself to trade generated boilerplate for a bit of runtime reflection and bytecode "enhancement". However, unlike Generic Haskell, Java erases a lot of type information (and it has much less of it to start with), so usually solving "polytypical" tasks becomes not as much of type-directed programming as naming-convention-directed and recently annotation-directed programming.

BTW, I just found a paper that looks relevant to current discussion: A Design for Type-Directed Programming in Java.

Generics vs. Design Patterns

The difference is that most researchers consider design patterns to be an open research problem. Haskell has generics but there are limitations.

Based on my limited understanding of what the paper is talking about, and being more comfortable with SML, I'd express the problem using a list and a tree. In ML, a list and a binary tree could be expressed as:

datatype 'a list = Nil | Cons of 'a * 'a list
datatype 'a btree = Leaf | Node of 'a * 'a btree * 'a btree

There are common things that we might want to do that would be quite similar, like scale or find the length. To find the length, we'd have:

fun length Nil = 0
  | length (Cons(_, xs)) = 1 + (length xs)
val x = length (Cons(1, Cons(2, Cons(3, Nil))))

fun length Leaf = 0
  | length (Node(_, lt, rt)) = 1 + (length lt) + (length rt)
val y = length (Node(2, Node(1, Leaf, Leaf), Node(3, Leaf, Leaf)))

Both length functions are generic in the sense that the type of 'a can be of any type. However, it is not generic in the sense that I could write a single length function that will operate over a list or a tree. There's a design pattern in there somewhere that could probably be described. But the problem with design patterns is that the code is not reused - just the structure of the code. So every time I added a type, I'd have to create a new length function based on the underlying pattern.

Design patterns require writing boilerplate

...and polytypic programming fights boilerplate.

One of the approaches to polytypic programming was called Scrap your boilerplate for a reason.

I guess that's just to say I agree with both your explanation and example.