Parametric Polymorphism from a C++ Perspective (good or bad?)

I consider myself an expert in imperative programming languages but I'm very green in the field of modern functional languages (I'd be better off alligator wrestling than writing monads, but I can still 'cadr' your 'cons'). Here is a quote from a well-written post by Matt Hellige ( http://lambda-the-ultimate.org/node/1455 ) which kept me up last night:

Things get more complicated (or is it simpler?) when you consider polymorphic types, like "α -> α" where α is a type variable rather than a concrete type. In a pure language with "parametric" polymorphism, there is only one function with this type. The word parametric in this context effectively means "type variables are black boxes": i.e., there's no way to say "if α = int, then..." or anything of that form. They're completely abstract. Once you see that, it's pretty easy to see that a type like α -> α can have only one implementation, or "inhabitant": since there is no conceivable way that the function can inspect its argument or construct a new value of the same (unknown) type, we can prove that any value with this type must be the identity function.

I didn't "see" what Matt was talking about until today. You have to realize that I think of C++ templates (and Java / C# generics) when I think about parametric polymorphism. So I didn't understand where the restriction came from.

What had confused me was that in C++ the compiler does allow type-specific choices by overloading template functions. Whether or not this is a "good thing" is probably debatable, so I figure why not debate it. Is C++-style parameterization, where template <typename T> T f(T x) {...} can legally have multiple implementations, a good thing or a bad thing?

The thing about templates in

The thing about templates in C++ is that code is generated for each instantiation. That is, even if you don't overload, you will still get a new set of code generated for each type you use the template with.

And this is the only way C++-style overloading can work. Suppose you write a generic function f that is overloaded for a specific type, and then you write a generic function g that calls f but is not itself overloaded for that specific type. The compiler cannot generate the same code for g for all types. It must generate code for each type, because each instantiation must call the correct version of f. And so on.

C++ templates were my first introduction to parametric polymorphism (or at least the first introduction that I understood) and they blew my mind when I encountered them. But after working with the equivalent in ML and Haskell... C++ templates really seem kind of silly and way too complicated. The whole mechanism of doing parametric polymorphism by textual inclusion of the function/class definition, and creating code for each instantiation, is just bogus. I suspect that the reason that particular mechanism was used is just because in C++ nothing is boxed, and so types have different sizes. In fact I actually think that a lot of C++'s bogosity is due to trying to shove features into the language without boxing things.

I believe that if you want different code depending on what type you are using, what you want is probably either OO or Haskell-style type classes or something along those lines... not parametric polymorphism.

Terminology

Things become easier when you realize that the term parametric polymorphism means different things in each of these cases. The concepts are related but distinct.

Can it?

Can multiple implementations of template T f(T x) {...} be created (while remaining truly parametric)? I can't see it. I'm thinking f still has to be the identity function.

Because of overloading you

Because of overloading you can implement a function that has some specific effect for one type, and is the identity function for all other types:

int f(int x) {
    return x + 1;
}
template <typename T> T f(T x) {
    return x;
}

Whether this fits your question depends on the definition of parametric polymorphism.

C++ templates are NOT parametric polymorphism

That may be a large chunk of your confusion. C#/Java generics should be (actually, they should be bounded parametric polymorphism). C++ templates allow you to cover the cases that parametric polymorphism covers (they "subsume" parametric polymorphism, at least the first-order case), but it is not possible to determine whether a function is truly parametric by looking at its type, i.e. parametricity is effectively lost (of course, it's significantly weakened anyway due to side-effects). Here's some code to compare:

template <typename T>
T id(T x){ return x; }

does have exactly the same parametricity properties as the Haskell function,

id :: a -> a
id x = x

but,

template <typename T>
T f(T x){ return x*x; }

has exactly the same type, but dramatically different parametricity properties; it's kind of like the Haskell function

f :: (Num a) => a -> a
f x = x*x

As you can see in the Haskell version (or in the C#/Java version using bounded parametric polymorphism), it is clear that we are not talking about any a, but only those that can be multiplied. f's type in Haskell is different from id's type. This is not the case in C++: their types are exactly the same. This means that you can make effectively no assumptions about a function given just its type (assumptions related to parametricity, that is). Furthermore, the function can make arbitrarily complex assumptions about the type of its arguments and not reflect that in the type at all, hence all those Boost macros and template chicanery to try to make it clearer what exactly the expected assumptions are. As I said, while side-effects weaken parametricity properties we can still make some assumptions, i.e. a function forall a.a -> String, say, in a language with side-effects still cannot make any use of its argument; however, it is not required to be a constant function as it would be in a pure language.

I was a bit vague before, so

I was a bit vague before, so thanks for clarifying. The term "parametric polymorphism" is used in the mainstream imperative world to refer to situations where a function or class is polymorphic (e.g., works on multiple types), yet this polymorphism isn't the result of inheritance (i.e., subclass polymorphism) but rather of parameters (most often type parameters, but other kinds of parameters are usually also possible). Hence, "parametric polymorphism". This isn't the same as having a polymorphic type in HM and similar type systems.

Another axis for this comparison is so-called polytypic programming, in systems like Generic Haskell etc. Just like the term "parametric polymorphism", the term "generic programming", which used to refer to these ideas, is often used to refer to things like templates and generics in Ada/C#/Java etc. This adds yet another dimension to the confusion...

it is not possible to

it is not possible to determine whether a function [template] is truly parametric by looking at its type

And the reason for that is that templates do not have anything like types in the first place - which is the most fundamental difference between templates and proper forms of polymorphism, parametric or not.

Must it be identity?

It would seem that, in addition to identity, "bottom"-returning functions would be acceptable, for whatever the language's various bottoms might be. In Java, for instance, a function that never returned, a function that immediately threw an unchecked exception, and a function that simply returned null could all be validly typed as a->a, all without introspecting or bounding a. Is it provable that no such functions are valid in languages with Hindley-Milner type inferencing?

Depends on the language

Once you add recursion (to a pure lazy language), there are two functions of type forall a.a -> a. One is the polymorphic identity function, the other is the constantly bottom function. (Derek Elkins alluded to this in the grandparent of Matt Hellige's post quoted above).

Of course, there are more than two terms of type forall a.a -> a. But any function of that type will be observationally equivalent to one of those two functions.

Parametric Polymorphism and Operational Equivalence (PDF link) builds up the operational machinery necessary to prove such equivalences.

Side-effects

I also alluded to the fact that (other) side-effects are just as effective at adding values to forall a.a -> a. For example,
the parametrically polymorphic equivalent of

template <typename T>
T id(T x){ format_harddrive(); return x; }

is (/would be, barring an effects system) well-typed and is certainly not observationally equivalent to the identity function. However, there are plenty of polymorphic strongly normalizing lambda calculi so it is possible to have a "programming" language where id is truly the only inhabitant of forall a.a -> a.

Some nitpicking to end:

Once you add recursion (to a pure lazy language), there are two functions of type forall a.a -> a.

Non-strictness is technically moot for a strongly normalizing language. (Laziness can still be mildly relevant, but that's splitting hairs.) More to the point, non-strictness is not what matters here: adding general recursion to any strongly normalizing language leads to that type having the same extra inhabitant, since the constantly bottom function is legitimate in a strict language too.