Signature-Based Polymorphism, Structural Subtyping and Duck Typing

Hello all,

I could really use some help in enumerating the various languages which provide support for signature-based polymorphism, structural subtyping, and/or duck typing. Thanks in advance!

Probably a long list

Wouldn't all the dynamic OO languages qualify as having duck typing: Smalltalk, Perl, Python, Ruby, Lua, and many, many more? And, IIUC, the ML family would qualify as having signature-based polymorphism.

What about structural subtyping?

Yes, I see.

Now where does "structural subtyping" fit into all of this? Duck typing doesn't really seem to qualify, because it has little to do with typing or subtyping; it is simply a matter of runtime method resolution. Aren't "structural subtyping" and "signature-based polymorphism" roughly interchangeable?

duck typing is structural typing

Runtime method resolution is how it's implemented, but the Python type system behaves like one with structural subtyping, albeit dynamic rather than static.

What types?

Since none of these languages have types at all, where does subtyping enter the picture? How is DoesNotUnderstand any different from not finding a key in a FiniteMap? We don't think of the latter as a type error, so the former isn't a type error either.

No subtyping involved in those languages

Duck typing is a term usually associated with dynamically (latently / at runtime) typed languages. Smalltalk, with its dynamic dispatch, is the quintessential example of duck typing. One can argue whether duck typing should be sanctioned as officially accepted terminology, but then one would have to argue that duck typing is an oxymoron.

but then one would have to

but then one would have to argue that duck typing is an oxymoron.

Because in dynamically (latently / at runtime) typed languages, there really isn't any subtyping going on, correct?

Ocaml has duck typing

on both objects and modules. If an object supports the required operations, it can be converted into that type. I believe this means all the ML-derived languages (SML, Haskell, Ocaml, and others) have this feature.

Not all of them.

In Haskell, if two types have the operations required by a context, they can both be used in that context.

And, maybe, they can be converted into each other.

I don't follow

In Haskell, if two types have the operations required by a context, they can both be used in that context.

Doesn't that imply signature-based polymorphism / structural subtyping? If a type can be used wherever another is required, isn't that a form of subtyping?

Signature-based

Signature-based polymorphism? Yes.

Structural subtyping? No.

Given a value of some type, we can use it only in polymorphic functions whose context agrees with the type's capabilities.

If some function requires another specific type, we cannot use it there, because that function isn't polymorphic.

And you'd better take a look at Haskell yourself.
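
To illustrate, here is a minimal Haskell sketch (the function names are made up for this example):

  -- Usable with any type whose capabilities satisfy the context,
  -- here the constraint 'Show a':
  describe :: Show a => a -> String
  describe x = "value: " ++ show x

  -- Monomorphic: only Int will do, regardless of what operations
  -- another type happens to support.
  describeInt :: Int -> String
  describeInt n = "int: " ++ show n

Here describe True and describe (42 :: Int) both typecheck, but describeInt True is rejected even though Bool supports show: there is no subtyping step that lets Bool stand in for Int.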

Given a value of some

Given a value of some type, we can use it only in polymorphic functions whose context agrees with the type's capabilities. If some function requires another specific type, we cannot use it there, because that function isn't polymorphic.

So Haskell polymorphic functions are essentially the same thing as C++ template functions.

For certain exceedingly

For certain exceedingly unsubtle and rather misleading values of 'essentially', yes.

Okay I'll bite

I am aware of the obvious differences between functions in an imperative C-style language and functions in a lazily evaluated, pure functional language like Haskell. But, just to be clear: are there other fundamental differences between Haskell polymorphic functions and C++ template functions that you are hinting at?

Yeah. No mandatory

Yeah. No mandatory compile-time instantiation, for one. Nor is there any (partial) specialisation (though related effects can be achieved through type classes: constraints on type parameters are of the form "type x is in type class y", or, in the case of multi-parameter classes, "there exists an instance of class foo for types (x, y, z)").

C++ template functions may

C++ template functions may look like a polymorphic function in ML or Haskell, but the similarity is only superficial. The two are really quite different.

In ML or Haskell, you can think of a polymorphic function as a function which takes extra arguments for the type parameters. Type inference then figures out two things: first, where to insert the extra type parameters into a definition, and second, which type arguments to supply at each call site.

In the Ocaml definition

  let id = fun x -> x

You can think of this as "really"

  let id = Fun (a:type) -> fun (x:a) -> x

and you can take the application

  id(5)

to be

  id(int)(5)

However, this understanding is not accurate for C++. A polymorphic function definition stands on its own, but a template definition does not. Instead, a template definition is a clause in the definition of a recursive compile-time program.

You invoke this compile-time program by giving it a name and, optionally, some compile-time arguments (usually types, but also literal integers), and then this is recursively matched against the template definitions until you finally expand into a template-call-free specialization. A "specialization" isn't (yet) a program, but rather a set of overloaded functions and classes. Then another overloading resolution mechanism (which may be invoked as late as link time) actually picks out a particular program.

For example, the C++ definition

  template < typename A > A id(A x) { return x; }

looks a lot like the Ocaml definition above, but it's really just one clause in the larger compile-time template program. For example, we could add an overloaded definition

  double id (double x) { return x * 1.0; }

elsewhere in the program, and then the result of a template call can change. One thing that is similar to ML is that template function arguments can get inferred, so that a call

  id(5.1)

will get turned into

  id<double>(5.1)

But then the template instantiation machinery gets invoked, and this is where it diverges from the ML/Haskell approach: the C++ compiler will figure out how to expand id<double>, resulting in a specialization with two elements (the template definition and the overloaded definition for double), and then the overloading resolution mechanism will pick out the double-specific definition we gave.

Thanks for the detailed response

Thank you for the detailed response, it makes perfect sense!

Implementation strategy

I would argue that the difference you describe is mainly an implementation detail. You can use the same approach ("monomorphisation") for (most) polymorphic definitions in ML or Haskell, and some compilers in fact do, e.g. MLton. However, most FP implementations don't bother with this implementation strategy, because it wastes space and particularly wrecks separate compilation. C++ compilers on the other hand are forced to use this strategy, because the awfully ad-hoc properties of templates basically preclude any other sane strategy.
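
To make the monomorphisation idea concrete, here is a Haskell sketch of what such a compiler could generate (hypothetical output for illustration; GHC does not do this by default):

  -- One polymorphic source definition:
  swap :: (a, b) -> (b, a)
  swap (x, y) = (y, x)

  -- If whole-program analysis finds swap used at exactly two types,
  -- a monomorphising compiler can emit one specialised copy each:
  swap_Int_Char :: (Int, Char) -> (Char, Int)
  swap_Int_Char (x, y) = (y, x)

  swap_Bool_String :: (Bool, String) -> (String, Bool)
  swap_Bool_String (x, y) = (y, x)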

IMO the more fundamental difference is that ML-style polymorphism and Haskell type classes are fully typed, while templates are untyped (until "concepts" arrive).

Compile- or run-time monomorphisation: semantic differences?

I have been wondering about the strengths and weaknesses of both approaches. Take a language which allows type classes. Intuitively, I would think that run-time instantiation of types in terms gives a semantically more expressive language than, say, the same language where all terms are monomorphised for their types at compile time (with the types subsequently erased).

Are there examples of definitions in ML/Haskell where overloaded terms cannot be monomorphised at compile time?

Within Haskell 98, no. As

Within Haskell 98, no. As soon as you want to play with something like hs-plugins, though, it's a different matter.

Yes

Well, not in ML, but in Haskell. In Haskell 98 the sole possibility is polymorphic recursion, where a recursive definition recurs at a different type. Obviously, the set of required instantiations cannot generally be known statically then, and might actually be unbounded [*]. But polymorphic recursion is relatively rare.

Higher forms of polymorphism, e.g. existential types or higher-ranked polymorphism, which are supported by actual Haskell implementations and likely included in the next standard, also resist monomorphisation. This is not surprising, considering that existential types e.g. provide you with late binding.

[*] Actually, this is one of the reasons why C++ requires a limit on template instantiation depth: without it, you could express similar things, but the compilation strategy cannot handle them - the compiler would simply diverge on expansion.
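
Two Haskell sketches of the cases mentioned above. First, polymorphic recursion over a nested datatype; this is Haskell 98 provided the signature is given explicitly, and the recursive call instantiates len at a new type each time, so no finite set of copies suffices:

  data Nested a = Nil | Cons a (Nested [a])

  -- The recursive call is at type Nested [a], not Nested a.
  len :: Nested a -> Int
  len Nil         = 0
  len (Cons _ xs) = 1 + len xs

Second, an existential type (the ExistentialQuantification extension in GHC), which hides the concrete type of the wrapped value and thereby gives the late binding mentioned above:

  {-# LANGUAGE ExistentialQuantification #-}

  -- Only the Show constraint is visible; the concrete type of x
  -- is unknown statically, so show cannot be monomorphised away.
  data Showable = forall a. Show a => MkShowable a

  showIt :: Showable -> String
  showIt (MkShowable x) = show x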

A huge disadvantage for the

A huge disadvantage for the C++ approach is that you can't take the address of a template function, only of a particular instantiation: the template itself doesn't really exist, and is simply a recipe for generating code. I agree that run-time types are fundamentally more expressive than compile-time-only types.

No run-time types necessary

The ML/Haskell approach does not require runtime types. Types are usually fully erased during compilation. This requires a uniform representation for values of all types, though (that's e.g. why floats are usually boxed in these implementations, modulo local optimizations).

Type classes require a bit of runtime information, but this is not really types, just a vector of functions containing a class's methods (a so-called dictionary), very akin to vtables in OO, but decoupled from concrete values.
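
As a hand-rolled sketch of that translation (illustrative only, not GHC's internal representation), a class becomes a record of its methods, instances become record values, and constrained functions take the record as an extra argument:

  -- The dictionary for a hypothetical Eq-like class:
  data EqDict a = EqDict { eq :: a -> a -> Bool }

  -- A constrained function receives the dictionary explicitly:
  member :: EqDict a -> a -> [a] -> Bool
  member d x = any (eq d x)

  -- Each instance becomes a concrete dictionary value:
  eqIntDict :: EqDict Int
  eqIntDict = EqDict { eq = (==) }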

Dictionaries

Why are dictionaries carried around at run time? I would expect that if the types are monomorphised at compile time, they are known, and therefore the lookup into the dictionary can also be done at compile time.

In comparing C++ templates and Haskell 98 type classes, where is the pay-off? (In expressiveness, that is, not in generated code size or link time.)

Please see above

Dictionaries are needed because monomorphisation is not always possible. And the cases where it isn't are exactly those that are not expressible in C++, e.g. arbitrary polymorphic recursion, first-class polymorphism, existential types, etc., where the types at which a definition is used are not determined or even bounded statically. Please see previous posting(s).

Are dictionaries mandatory?

The Jhc homepage says:

Novel type class implementation not based on dictionary passing with many attractive properties. This implementation is possible due to the whole-program analysis phase and the use of the lambda-cube rather than System F as the base for the functional intermediate language.

More details here.

Not mandatory

You need some kind of runtime resolution. Dictionaries are just the standard implementation technique. Of course, one can use other approaches, e.g. actual type dispatching - which seems close to what JHC does (encoded with GADTs). But I haven't looked at the details and cannot tell what possible advantages that has.
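
For a rough idea of what type dispatching could look like (a toy illustration, not JHC's actual scheme), one can use a GADT as a runtime representation of types and resolve a method by case analysis on it rather than by passing a dictionary:

  {-# LANGUAGE GADTs #-}

  -- A runtime representation of (a few) types:
  data TypeRep a where
    IntRep  :: TypeRep Int
    BoolRep :: TypeRep Bool

  -- Matching on the tag refines the type of the second argument:
  showVia :: TypeRep a -> a -> String
  showVia IntRep  n = show n
  showVia BoolRep b = if b then "True" else "False"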

I think the C++ semantics is

I think the C++ semantics is very different from the ML/Haskell approach, because it lets you do computation on the structure of types. This means that it's inherently non-parametric, and so you can't get data abstraction via parametricity.

Template meta programming vs type class meta programming

Isn't that a consequence solely of template specialisation? That is, it is not necessarily a core property of the template mechanism as such, but of one particular advanced template feature.

Note that type class polymorphism is not parametric either (in particular, you can have type-safe casts). And I think you can express similar type-level computations as soon as you activate enough extensions, e.g. functional dependencies, overlapping instances, and the like.
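
For instance, here is the well-known encoding of type-level arithmetic with a multi-parameter class and a functional dependency (a sketch with made-up names):

  {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies,
               FlexibleInstances, UndecidableInstances #-}

  -- Type-level Peano naturals:
  data Z   = Z
  data S n = S n

  -- Addition as a type-level function: the dependency 'a b -> c'
  -- makes the compiler compute the result type c from a and b.
  class Add a b c | a b -> c where
    add :: a -> b -> c

  instance Add Z b b where
    add Z b = b

  instance Add a b c => Add (S a) b (S c) where
    add (S a) b = S (add a b)

The compiler evaluates this "program" during instance resolution: add (S (S Z)) (S Z) is inferred to have type S (S (S Z)).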

Motion-Types

Motion-Types has structural subtyping and type inference.