Parametric datatype-genericity

Parametric datatype-genericity. Jeremy Gibbons and Ross Paterson. Submitted for publication.

Datatype-generic programs are programs that are parametrized by a datatype or type functor. There are two main styles of datatype-generic programming: the Algebra of Programming approach, characterized by structured recursion operators parametrized by a shape functor, and the Generic Haskell approach, characterized by case analysis over the structure of a datatype. We show that the former enjoys a kind of parametricity, relating the behaviours of generic functions at different types; in contrast, the latter is more ad hoc, with no coherence required or provided between the various clauses of a definition.

How could we have not mentioned this before?

The main result of this paper is that fold is a higher-order natural transformation. This means, the authors explain, that fold is a rather special kind of datatype-generic operator, both enjoying and requiring coherence between its datatype-specific instances.

We had several long discussions about the uniqueness of fold, which may serve as an introduction for those new to this sort of discussion. The tutorial on the universality of fold is, of course, on the papers page.

For some reason I feel a craving for bananas...

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.


(This is a very cool paper!)

The main result seems to be that fold and unfold are natural transformations on a higher-order category. Even though definitions are relatively straightforward, I think it gets confusing to talk about hofunctions, honts, etc. I wonder if using 2-categories would simplify the presentation, like what John Baez talks about all the time in his column, eg. week80.

IIRC, natural transformations are simply the 2-morphisms in Cat. So it would seem that honts would be "regular" natural transformations on these 2-morphisms, or simply the 3-morphisms between the 2-morphisms?

The reason I ask this is: is there any hope to push this further and make a "kind-generic fold"? I'm assuming that there exists a natural "extension" of types into kinds that preserves everything necessary for datatypes, which I'm not sure is the case.

(It should be pretty clear that I know very little category theory, having only barely survived an undergrad course on it - feel free to tell me I'm horribly wrong!)


Translating this stuff into category theory isn't essential. It's sufficient to enumerate the coherence that laws folds are guaranteed to obey in a particular type system. The later approach is preferable, as abstract category theory presentations of these topics tend to obscure the results from programmers who could otherwise put them into practice.

To me, the paper's key point is that the Generic Haskell approach is ad hoc (resulting in fewer universal properties), whereas the Algebra of Programming approach is more lawful.

I realized a while ago that "zip" ought to be called "transpose", as the underlying operation is to translate a value of F(G(t)) to G(F(t)) using the underlying join. For example, you can naturally zip an array of dictionaries into a dictionary of arrays, where the resulting dictionary contains, for every index in the original dictionaries, the ordered concatenation of all values under that index from all of the sources.

Everyone has a fairly good understanding of functors and monads nowadays, but I feel like we're just scratching the surface of the generalizations that are possible regarding transformations between them.

Did anyone else realize...

...that fold() is an FSA emulator when we make the list a list of state transitions, and the initial value the FSA's initial state, and the application function apply a state transition to the state?

. . .

I guess this also presents another way to use fold--to implement a list monad.

And again, we might be able to use fold to implement some mechanics of STM.

Fold for program semantics

fold() is an FSA emulator when we make the list a list of state transitions

This is the essential idea presented in Fold and Unfold for Program Semantics (previously on LtU.) The paper is talking about using fold on functional evaluators, but of course evaluators can be mapped to FSAs (e.g., via this sort of correspondence).

Very well known

foldr c n for any c and n is a monoid action as I mentioned here. M-Sets are exactly the thing computer scientists use to (formally) represent state machines. Barr and Wells' lecture notes (an intro to category theory from a computer scientist perspective) have good discussion of this relationship and how it is viewed categorically (see particularly section 4.2.5).

I'm very interested in

I'm very interested in polytypic programming and was a big fan of representation types ala RepLib, so I'm very intrigued by this paper. I'll admit that it seems a tad beyond my current understanding though. I understand the conclusion (re: fold > Rep types), and can even abstractly understand why as I've always thought that folding over a type's structure is the only necessary polytypic operation, but I haven't studied enough category theory to follow the exact argument or how it translates to programming practice. If anyone can boil it down to some concrete type signatures, I'd be much obliged! :-)