Generics for the masses

Ralf Hinze. Generics for the masses. In Kathleen Fisher, editor, Proceedings of ICFP'04, Snowbird, Utah, September 19-22, 2004.

Mentioned (with no link) on LtU1.

Hinze shows how to program generically in Haskell 98, making extensive use of type classes.

Those interested in generic programming should make sure they are familiar with references cited in section 5 (many of which were discussed here in the past).

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

There must be a better way

It feels like there must be a better way to do this. I've been thinking about this for a while but a comprehensive solution isn't clear. But a few things do seem obvious:

First of all, function-overloading provides a more straightforward means of expressing this than typeclasses. This is even more so when one may abstract over the domain of an overloaded function. For example, given a reference "f" to a possibly-overloaded function, you want to be able to extract the type of f's domain using a syntax like "f.dom". This way, for a contrived example, you can write a generic function to take the sum of certain kinds of elements over certain data structures:

// The sum of a single integer is that integer.
sum(x:int):=x

// The sum of a character is zero.
sum(x:char):=0

// To generate the sum of a list of
// elements compatible with
// summation, we add up the sum of
// each element.
sum(x:list(sum.dom)):=
   if(x=nil) 0
   else sum(x.head)+sum(x.tail)
This way, we can say sum({})=0, sum(1,2,3)=6, sum({3,4},{5,{},7},"Hello")=19, etc. (Here, strings are arrays of characters.) A more useful example would be a generic framework for printing elements of generic types, including both values and containers like arrays and finite sets.

Second, this approach only works well in a closed-world environment. For example, to use a module containing generic printing functions for certain data types, and to use it in combination with your own generic printing functions, you would need to pass your own expanded printing functions in to the container's printing functions, so that it may recursively call both your sum functions and its own.

It appears that Haskell's type classes solve this problem by automatically generating implicit parameters for typeclass instances at their use site. Thus, identical typeclass invocations on identical data types may behave differently depending on their local context. And this is often desirable! This phenomenon is described in http://www.mail-archive.com/haskell@haskell.org/msg15080.html.

I think an optimal solution (given the requirements of a strongly-typed language, etc) is a combination of two orthogonal features: support for general overloading of functions (though perhaps with restrictions, like: the domain of the overloads must be disjoint), and support for lexically-scoped implicit parameters. Overloading is a convenience that avoids the need for all the manual plumbing in this paper, while implicit parameters would make generic types work in an open-world environment. Given these two features, plus dependent types, I don't think typeclasses would be needed in a language at all.

Re: There must be a better way

For example, given a reference "f" to a possibly-overloaded function, you want to be able to extract the type of f's domain using a syntax like "f.dom".

We don't need a f.dom feature if we define sum using a coalgebraic specification. If we can say x : {sum : self -> int} and read it as "x has type of something that can be applied to a sum operation giving an int" then the example don't need to use sum.dom:

sum (xs :: list {sum :: self -> int}) = ...;

This way we can write down the type of sum (i.e.{sum :: self -> int}) in this case and have overloaded sum functions with different arities in other situations.

Given these two features, plus dependent types, I don't think typeclasses would be needed in a language at all.

How could this be used to define a monad? In Haskell we can be sure that (>>=) won't work outside the defined monad (i.e. we can't write print 1 >>= \_ -> Just "Hello World" because Maybe and IO are different monad instances). I don't see how can we have this just using overloaded functions.

Re: There must be a better way

This sounds promising. I'd love to see any references you have on this approach. For example, is "self" context-independent or does it need to be qualified when nested several levels deep in definitions where its meaning is ambiguous? (Or am I misunderstanding?)

That monad example will fail to typecheck because, though ">>=" is overloaded, "print 1 >>= _" is of type "(t->IO t)->IO t", and is incompatible with the argument of type "t->Maybe String".

The other issue with replacing typeclasses with overloading and implicit parameters is illustrated by the monad "return" operation. "return" is of type "t->M t", so there is no way to overload it for different monads. If one needs to operate on monads generically, one stills has to bundle all of the operations for each monad into a data structure, in order to pass it around (either explicitly or implicitly). A record or typle would do just fine. This improves expressiveness: whereas there is no way in Haskell to operate on a typeclass instance as a first-class value or to have multiple instances of the same typeclass, explicit typeclasses make this possible. Cayenne (http://www.cs.chalmers.se/~augustss/cayenne/) is a Haskell-style language that supports this.

This highlights the strange multi-faceted role of typeclasses as a means of overloading by type, supporting implicit parameters, and bundling up of related overloaded things into data structures.

Re: There must be a better way

This sounds promising. I'd love to see any references you have on this approach.

Sorry I can't think of any references right now on this specific subject. The main references for this idea are Bart Jacobs papers on Coalgebra and some OO concepts/languages (e.g. multi-methods, O'Caml-like type inference for objects, E's lambda-based objects).

For example, is "self" context-independent or does it need to be qualified when nested several levels deep in definitions where its meaning is ambiguous?

AFAICS you can have it either way, using a regular type variable or making it context dependent:

-- using Haskell-like @
foo (xs :: {bar :: a@self -> {baz :: a -> b@self -> c} -> c})
-- or self is a regular type variable
foo (xs :: {bar :: a -> {baz :: a -> b -> c} -> c})

The type inference mechanism could work on a closed or open world assumption, depending on how generic you want your operations to be.

That monad example will fail to typecheck because, though ">>=" is overloaded, "print 1 >>= _" is of type "(t->IO t)->IO t", and is incompatible with the argument of type "t->Maybe String".

But what mechanism forbids the programmer to write (m :: IO a) >>= (f :: () -> b) = f ()?