Functional Pearl: Species and Functors and Types, Oh My!

Functional Pearl: Species and Functors and Types, Oh My! Brent Yorgey, draft.

We've discussed species many times before, and discussed at least one introduction for functional programmers. Here's another, this time considerably less technical. I would say this paper succeeds fairly well in giving a conceptual overview and some useful intuition. If you found the Carette and Uszkay paper sketchy in places, you may find this a gentler introduction.

On the other hand, I don't really see how it counts as a Functional Pearl...

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Species and Functors

This is a great idea that's revisited and reinvented from time to time. While the paper's Haskell examples are great, this approach can work at a more fundamental level in a programming language by extending regular type definitions to functor definitions automatically. For example, in Haskell we write:

    data Tree t = Leaf t | Node (Tree t) (Tree t)
    treeMap :: (t->u) -> Tree t -> Tree u
    treeMap f (Leaf x) = Leaf (f x)
    treeMap f (Node a b) = Node (treeMap f a) (treeMap f b)

But, the treeMap definition here is redundant: it can be derived mechnically from the definition of Tree, by following a regular set of rules that apply to unions, products, function spaces, and so on.

Instead of defining "map f xs" on List as we do today, we ought to define just the type List and be able to have "List f xs" map automatically. If this approach is taken to the extreme, you can eliminate the notion of types and the static/dynamic separation between types and functions: types just become identity functions, and what were previously type definitions now become a general mechanism for defining functors.

deriving Functor?

From your example, I thought that you might have been suggesting just that Haskell support 'deriving Functor' for applicable type definitions.

But what exactly do you mean by types becoming identity functions?

deriving Functor

I just read up on GHC's new "deriving Functor" mechanism. Automating the definition of a types fmap is welcome, but is fairly indirect: from a type, it generates an instance of a type class and defines its fmap member, which entails a number of complications relating to module visibility, multiparameter typeclasses, etc.

If Haskell supported reasoning about values at compile-time, as dependent-typed languages like Cayenne do, then a type t could be equated to its identity function

   t_id :: t->t
   t_id x=x

and then the parametric type definition currently written as

   data List t = Nil | Cons t (List t)

could instead be the parametric functor definition

   data List f = Nil | Cons f (List f)

So, instead of defining the list type using Haskell's data syntax, you're defining the list functor using an extended data syntax. This function can be used for reasoning about identity functions, since a functor F applied to an identity function id::t-> yields a new identity function id'::F(t)->F(t). Thus "List Int" in the List-functors-with-types-as-identity-functions world behaves identically to "List Int" in the List-type-with-ordinary-Haskell-types world.

So, instead of defining two things (a parametric type T t, and a function fmap::(t->u)->T(t)->T(u), we would define a single parametric function F f. Given an identity function t, F t would equate to the old T t. And F f replaces the old fmap f.

I'm not suggesting Haskell be redesigned around an approach like this. But, if you're building a dependent-typed language, it's a big win.

What does this simplify?

When you define the functor (List f), doesn't that still need to create the List data type? Just declaring (List t) to be an identity function on (List t) seems circular. If all the same semantics for data types are still present, what's the advantage of making this identification? So that we can use the name of a type instead of 'id'? Or the name of a functor instead of 'fmap'?

Also, how do co-functors (or more complicated data types) fit into this? data Foo t = Foo t->Int, etc.


The real simplification is that you can write a single lambda-free definition that provides both the type and functor behavior without resorting to typeclass derivation tricks.

Deriving a functor from a parametric type definition requires that the type parameters be used with consistent variance. The previous examples were covariant, but the technique also works with contravariant uses, and scales up to multiparameter functors with mixed variance.

   -- Covariant functor.
   Bar t = Bar Int->t
   map f (Bar g) = Bar (\x -> g (f x))

   -- Contravariant functor.
   Foo t = Foo t->Int
   map f (Foo g) = Foo (\x -> f (g x))

GHC now derives functor

…this approach can work at a more fundamental level in a programming language by extending regular type definitions to functor definitions automatically.

GHC can now derive 'fmap', though the user guide does not explain for what types it can do so. This ticket explains in more detail.

Of course, you can also do this with DrIFT, derive, and generic classes.

You could also derive Functor with Generic Haskell, but it has bit-rotted (and even domain name rotted).

You might like...

{-# LANGUAGE DeriveFunctor #-}
data Tree t = Leaf t | Node (Tree t) (Tree t) deriving Functor

I can't keep track of which extensions have made it into the ghc distribution, but there may additionally be at least one for full-blown generic programming in there these days.


...this is less interesting than I hoped. :(

IMO, what we really want is the map action for the functor Tree is a fixed point of (what ML and Haskell programmers call "two-level types"). This action is really what lets you avoid writing boilerplate code. For example, suppose you have a little expression language:

  type exp = Var of string | Let of string * exp 
           | Plus of exp * exp | Times of exp * exp | Minus of exp 
           | IfZero of exp * exp * exp 

Now, suppose we write a substitution function replacing variables with numbers. We'll need to explicitly give cases for all the structural cases, which is entirely redundant:

let rec subst n x = function
  | Var y when x = y -> Lit n
  | Let(y, e1, e2) when x = y -> Let(y, subst n x e1, e2)
  (* all this is boilerplate *)
  | Var y              -> Var y
  | Let(y, e1, e2)     -> Let(y, subst n x e1, subst n x e2)
  | Plus(e1, e2)       -> Plus(subst n x e1, subst n x e2)
  | Times(e1, e2)      -> Times(subst n x e1, subst n x e2)
  | Minus e            -> Minus (subst n x e)
  | Lit n              -> Lit n
  | IfZero(e1, e2, e3) -> IfZero(subst n x e1, subst n x e2, subst n x e3)

But if we had the expression type's underlying functor and map action:

type 'a expF = Var of string | Let of string * 'a * 'a
         | Plus of 'a * 'a | Times of 'a * 'a | Minus of 'a | Lit of int
         | IfZero of 'a * 'a * 'a

let map_expF f = function 
  | Var y              -> Var y
  | Let(y, e1, e2)     -> Let(y, f e1, f e2)
  | Plus(e1, e2)       -> Plus(f e1, f e2)
  | Times(e1, e2)      -> Times(f e1, f e2)
  | Minus e            -> Minus (f e)
  | Lit n              -> Lit n
  | IfZero(e1, e2, e3) -> IfZero(f e1, f e2, f e3)

type expr = In of expr expF

Then it's easy to write a succint substitution function which only requires writing code for the cases where something interesting happens.

let rec subst n x (In e) =
  In(match e with 
     | Var y when x = y -> Lit n
     | Let(y, e1, e2) when x = y -> Let(y, subst n x e1, e2)
     | _ -> map_expF (subst n x) e)

Type fixpoints also give you fold for free

You certainly already know that, but it is also interesting to point out that given that "derecursified" type and its `map` operator, you can easily derive fold and unfold :

fold f = f . map (fold f) . unFix
unfold f = Fix . map (unfold f) . f

(Fix/unFix being the boxer/unboxer of your equirecursive types, and are unnecessary with isorecursive types)

On Species

The whole point of my (quite sketchy indeed!) paper with Gordon was that what Tim Sweeney says about regular Functors is actually true of Species in general. We tried to convey this, but we did not succeed. [There was not enough room for a decent introduction to Species, which Yorgey's paper rectifies AND then showing all the wonderful things one can do with species; we fell into the 'sketchy' trap while trying to cover too much instead of covering less ground but doing it well]. 'deriving Functor' and other nice features that Tim alludes to are possible in dependently-typed languages, are significantly easier to obtain, in non-dependently-typed languages with HM style types, for ground types based on Species (aka generalizing those based on regular functors).

I think, by now, a lot of people 'know' this. But no one has yet written it up properly [including me!].


Unfortunately, I don't think this paper is perfect, either, but the flaws are probably fixable in revisions. I could also see it being rejected for lack of heavyweight technical machinery. I agree that it bites off a more manageable amount for a single intro paper, but unfortunately one may be left with the feeling that this is merely a cute notation without a lot of new applications. It seems to be a tough tradeoff for a single paper, but we have to start somewhere. I do wonder if it will be accepted... The "folklore" problem only gets worse as the number of species papers circulated in draft increases...


Note that it was submitted as a Pearl, so it is understood that this is an introduction to known work (but with a rather nice Haskell implementation). I am completely convinced there are a lot of applications - see the concluding paragraph of Yorgey's paper for the areas where I (also) believe the main ones lie, which we'd tried to indicate in our earlier draft. Actually, Gordon and I are working on 'testing' (a la QuickCheck and SmallCheck) right now, because of exactly this motivation to show the usefulness of species.


I share your conviction and look forward to seeing the results of your work!


Despite being fascinated by ability to express some structures succinctly, I am a bit skeptical. It would be nice to have more results on expressive power. Right now, for example, it is not very obvious if one can express types like k-permutation, and if yes, then what is the encoding. Would it need multiple sorts?
In general, is the set of species combinators and basic species selected based on some principles, or mainly ad hoc?

species combinators and basic species

[I can't offhand answer your question about k-permutations]

The basic species are far from ad-hoc: they are called 'atomic species' and correspond exactly to certain group actions (exactly == bijectively). There is a fundamental decomposition theorem for species which says that they call be decomposed into sums of products of atomic species. The basic species that are then used in examples are the ones that "show up" earliest, since species come with a grading.

The combinators are a little bit ad hoc, but easily justified: most of them correspond to very natural operations on formal power series OR very natural operations on combinatorial structures OR very natural categorical operations. The main combinators are those which are in fact 'natural' in all 3 senses.

All of this is carefully explained in the Bergeron-Labelle-Leroux book (end of chapter 2).


The book he refers to is this one: Combinatorial Species and Tree-like Structures. It's an enjoyable read and also a must-read for anyone interested in species.

Thank you for

Thank you for clarification!

There is a fundamental decomposition theorem for species which says that they call be decomposed into sums of products of atomic species. The basic species that are then used in examples are the ones that "show up" earliest, since species come with a grading.

The part that confused me was that the notion of atomic was "not atomic enough" - some of atomic species are not basic (e.g., E product E, IIRC). Thus the decomposition theorem does not explain the choice of basic species. I think I indeed need to read the book... Thanks again!

Small clarification

E product E is not atomic, it is 'molecular'. E is atomic.

Bigger clarification

Sorry, I should have checked my notes - I meant E2 o E2 (composition).

I'll get back on this later

I'm at a conference in Paris without access to my own notes, and this is not an 'easy' question. I will return to this when I get back.