Type Checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance

Type Checking Modular Multiple Dispatch with Parametric Polymorphism and Multiple Inheritance by Eric Allen, Justin Hilburn, Scott Kilpatrick, Victor Luchangco, Sukyoung Ryu, David Chase, Guy L. Steele Jr.:

In previous work, we presented rules for defining overloaded functions that ensure type safety under symmetric multiple dispatch in an object-oriented language with multiple inheritance, and we showed how to check these rules without requiring the entire type hierarchy to be known, thus supporting modularity and extensibility. In this work, we extend these rules to a language that supports parametric polymorphism on both classes and functions.

In a multiple-inheritance language in which any type may be extended by types in other modules, some overloaded functions that might seem valid are correctly rejected by our rules. We explain how these functions can be permitted in a language that additionally supports an exclusion relation among types, allowing programmers to declare “nominal exclusions” and also implicitly imposing exclusion among different instances of each polymorphic type. We give rules for computing the exclusion relation, deriving many type exclusions from declared and implicit ones.

We also show how to check our rules for ensuring the safety of overloaded functions. In particular, we reduce the problem of handling parametric polymorphism to one of determining subtyping relationships among universal and existential types. Our system has been implemented as part of the open-source Fortress compiler.

Fortress was briefly covered here a couple of times, as were multimethods and multiple dispatch, but this paper really generalizes and nicely summarizes previous work on statically typed modular multimethods, and does a good job of explaining the typing rules in an accessible way. I think the integration with parametric polymorphism is key to applying multimethods in other domains that may want modular multimethods but not multiple inheritance.

The formalization in Coq might also be of interest to some.

Another interesting point is Fortress's use of second-class intersection and union types to simplify type checking.

Most specific instance selection

Thanks for the post, Sandro. This fits in well with our recent discussions on overloading.

I wonder if someone can justify this mechanism on semantic grounds. Why is it a good idea to even be asking the question of which is the most specific instance? When we are defining a function by pattern matching, we don't list a bunch of patterns in random order and let the compiler tease out which pattern is most specific. It seems to me that there are two cases with overloads:

1. At least one of the overloads is aware of the other and there is a particular order in which overloads should be tried (using the first that applies). This is the case with typical pattern matching of closed algebraic datatypes.

2. The overloads are mutually unaware of each other. In this case, I would argue that no ambiguity should be tolerated. Even if one is more "specific" than another, that may be accidental and shouldn't be used to resolve the overload.

It seems likely to me that the idea of "most specific dispatch" arose out of an attempt to handle both of these cases with a single mechanism. I'd question whether that is a good idea.

Your second case is handled

Your second case is handled in Fortress using the meet rule, where the definitions of each overloaded function must form a meet semilattice, i.e., in the paper's words: "The Meet Rule requires every pair of declarations to have a disambiguating declaration, which is more specific than both and applicable whenever both are applicable. (If one of the pair is more specific than the other, then it is the disambiguating declaration.)"

If I understand you correctly, you're positing a scenario where some function foo has definitions on types (t1, t2) and (t0, t3) but is dispatched on (t1, t3), where t1 <: t0 and t3 <: t2. In this case, the second definition isn't aware of the first, nor the first of the second, and neither definition is more specific than the other at that call. The meet rule would require a disambiguating overload to resolve this case.

Edit: as to whether it's a good idea, I'm not sure. I think most of the difficulties arise from the pervasive subtyping of Fortress. I can imagine a simpler language with more controlled subtyping, for example one where each algebraic constructor is its own type that is a subtype of the algebraic datatype, being much more straightforward.

Not that case, exactly

For example, if one module defines f(Nat) and another defines f(Number), I'm saying it is a mistake not to report the ambiguity just because Nat is more specific. It's possible the person writing the call only had the Number definition in mind.

Perhaps a more specific

Perhaps a more specific example of where this would be undesirable might demonstrate why it's a bad idea. I'm not seeing how your example differs from overriding a method f by inheriting from Number. The code calling f on Number would know that it's calling a method that could be overridden by any subtype.

OOPS

I read this as "most OO languages don't have sensible semantics anyway, so how is this any different?" Fair enough, but I think you can often interpret object encodings in a somewhat sensible way, with methods contributing both an interface and a default implementation. A subclass can be seen as overriding the default implementations while building a record of methods. There is no "most specific type" based dispatch going on in this view, even if that would produce the same result. When we switch to multi-methods, there is no longer a corresponding (somewhat) sensible semantics.

When we switch to

When we switch to multi-methods, there is no longer a corresponding (somewhat) sensible semantics.

Isn't there? What's the meaning of "sensible"? The Fortress semantics seem quite sensible to me, which is why I asked for a specific example where you think it, or even single-inheritance overriding, isn't sensible, so I can understand precisely what that means.

Edit: consider the example I brought up earlier, of limited subtyping where sum constructors are types that subtype the sum type itself. "Most specific" is then simply pattern matching, and moreover enables a form of extensible sums and cases. There doesn't seem to be anything nonsensical about the principle of "most specific" itself, so what I'm trying to say is that I think your problem with Fortress or multimethods probably lies elsewhere, like with the pervasive subtyping.

"Most specific" is then

"Most specific" is then simply pattern matching, and moreover enables a form of extensible sums and cases.

But one that suffers from the problem of more specific patterns silently overriding more general patterns. In a modular setting, this looks like a misfeature to me. The "more specific than" relation looks like a proxy for "is intended to specialize". That seems dangerous when these patterns are spread among modules and the authors might not even be aware of each other.

Dangerous and Useful

Usually these specialisations are different algorithms that do the same thing. For example, sorting is a family of algorithms. Current languages impose only syntactic and type constraints on the specialised algorithms in a family, but ideally there should also be semantic constraints in the form of axioms.

However, there will always be things that are difficult to enforce; for example, how do you prove that vector addition is actually the same kind of thing as integer or floating-point addition?

Rust actually places constraints on where implementations can be declared, which is more restrictive than the axiomatic approach but much simpler to implement. Rust's rules are roughly that you can only implement a trait declared in another crate for a type declared in the same crate as the implementation. This leads to the common pattern of declaring a wrapper type in your own code so that you can implement library traits for types declared elsewhere (including generic type parameters, which might be instantiated with foreign types, so you need to wrap them in a local type). This means you cannot redefine addition on integers, but you can define a wrapper and redefine addition on wrapped integers. It's overly restrictive in that you cannot redefine addition on all types that implement an integer trait, because some of those types might already have addition defined elsewhere; you can, however, redefine addition on types that implement the integer trait and are wrapped in your local wrapper type.
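To make that concrete, here is a minimal sketch of the wrapper pattern in Rust (the Wrapped type and the choice of saturating addition are just illustrative):

    use std::ops::Add;

    // A local newtype wrapper around a foreign type (here i32).
    struct Wrapped(i32);

    // The orphan rule forbids writing `impl Add for i32` here, because both the
    // trait and the type are defined in other crates, but implementing the
    // trait for the local wrapper type is allowed.
    impl Add for Wrapped {
        type Output = Wrapped;
        fn add(self, other: Wrapped) -> Wrapped {
            // A deliberately different "addition": saturating instead of the
            // built-in overflow behaviour.
            Wrapped(self.0.saturating_add(other.0))
        }
    }

    fn main() {
        let x = Wrapped(i32::MAX) + Wrapped(1);
        println!("{}", x.0); // prints 2147483647 rather than overflowing
    }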

Alternatives

Obviously these features have uses that would be affected if you just removed them and didn't replace them with anything else, but that's usually true for bad features as well as good ones. The question is whether the proposed mechanism is better than alternatives.

I'm generally skeptical of the approach to language design that starts with a mechanism and tries to tweak it until it has nice properties. I'd prefer to start by finding semantics that faithfully model the aspects of the problem we are interested in and then build a language around that semantics. "Most specific" dispatch still seems to me to be a fairly ad hoc mechanism.

Programming as Abstract Algebra

We want to model abstract algebra, so we start by implementing algorithms over common algebraic concepts (Elements of Programming, chapters 1 - 5). Once we understand the patterns of abstract algebra, we can look at algorithms over core computer science concepts like containers. It turns out the algebraic structure of containers can be modelled by a family of iterators, and so we explore the algorithms that can be implemented in terms of different iterators (Elements of Programming, chapters 6 - 9). Then we take what we have and look at how we can do something useful like rearrangements and rotations, which are very useful generic algorithms that turn out to have all sorts of uses (Elements of Programming, chapters 10 - 11). This includes algorithm selection which is where you want trait (or type-class) based dispatch, and specialisation, as you want to select the optimal rotation algorithm depending on the type of iterator supported by the container. This relies on the earlier decision that containers should only implement iterators that are efficient for that container (which follows the general principles of abstract algebra).

So this is all part of a disciplined mathematical model, and not ad-hoc in any way. Reading Elements of Programming should make this clear. Watching this video is a good place to start: https://www.youtube.com/watch?v=Ih9gpJga4Vc

a use of lazy eval or something more sophisticated

I like the idea of specifying algorithms in the simplest forward form, and yet being able to pull data through layers of them as needed.

But lazy eval, at least as implemented in Haskell, has the property that if you can still access an old calculation, its result is kept around for the life of the program - there is no optimization that says it's sometimes better to stop caching and recalculate later.

I once tried to do this sort of thing with layers of ring buffers and optimized loops using templates in C++ (I was doing sound processing).

I never figured out why pulling data through many layers of these was slow - I thought I had calculated the size and use of the buffers so that nothing would have to be recalculated in normal use.

But I like this idea best:

You specify the algorithm in its simplest form, maybe with some small hint about needed overlap and expected block size, and another program optimizes the buffers and partial generators behind your back.

In other words I want a new model of computing based on metaprogramming!

Lazy evaluation is a misfeature

The problem with lazy evaluation is that the performance is very unpredictable and difficult for people to understand. Having spent some time writing large programs in Haskell, I now think default lazy evaluation was a misfeature. I prefer Rust's eager approach, as performance is much more predictable and understandable. I don't think you actually need laziness, as you can just use an object and an iterator. If you always access collections through an iterator interface (which I think is the right way to do it), then you don't need laziness.
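To illustrate what I mean (a toy Rust example of my own): iterator adaptors still let you describe a computation without running it, but the consumer decides exactly when the work happens, so the cost model stays predictable.

    fn main() {
        let xs = vec![1, 2, 3, 4, 5, 6];

        // Nothing is computed yet: `map` and `filter` only build an iterator value.
        let pipeline = xs.iter().map(|x| x * x).filter(|x| x % 2 == 0);

        // The work happens here, all at once, driven explicitly by the consumer.
        let evens: Vec<i32> = pipeline.collect();
        println!("{:?}", evens); // [4, 16, 36]
    }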

What do you think of the prospects

of a streaming processing library based on metaprogramming?

Assume we have a language with staging and/or just dynamic programming.

That could give us new models for writing libraries - run-time template expansion, subprograms that reoptimize based on usage patterns, and so on.

Needs more careful

This includes algorithm selection which is where you want trait (or type-class) based dispatch, and specialisation, as you want to select the optimal rotation algorithm depending on the type of iterator supported by the container.

Which? Trait or type-class dispatch? Does it matter?

Traits = Type Classes

Haskell type classes are the same as Rust traits. See: http://science.raphael.poss.name/rust-for-functional-programmers.html

Edit: There are some differences: Haskell does not backtrack type-class instance selection, so it is committed choice. Rust does backtrack trait implementation selection, so if it finds a trait dependency unsatisfied, it will backtrack and try a different (overlapping) implementation.
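Roughly, the correspondence looks like this (my own sketch, not taken from the linked article):

    // Rust trait, approximately the Haskell class
    //   class Monoid a where { mempty :: a; mappend :: a -> a -> a }
    trait Monoid: Sized {
        fn mempty() -> Self;
        fn mappend(self, other: Self) -> Self;
    }

    // ...and an impl, approximately `instance Monoid String`.
    impl Monoid for String {
        fn mempty() -> Self {
            String::new()
        }
        fn mappend(self, other: Self) -> Self {
            self + &other
        }
    }

    fn main() {
        let s = String::from("type ").mappend(String::from("classes"));
        assert_eq!(s, "type classes");
    }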

Thanks

Yeah, I guess they're closer than I realized. I'm not really familiar with Rust. Isn't another difference that traits are single-parameter?

Not any more.

Rust now does full multiple dispatch on traits, and supports specialisation.

Edit: I realise now that the question was about multi-parameter type classes, not multiple dispatch, but they are related, as are associated types. You can see Peano numbers and arithmetic on them here: https://github.com/paholg/peano/blob/master/src/lib.rs which demonstrates that you can do multi-parameter type classes. In effect, I think "impl X<U, V> for T" is equivalent to "instance X t u v".
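As a contrived sketch of that reading (the names are mine and have nothing to do with the Peano library): a trait with extra type parameters acts like a multi-parameter type class, with Self playing the role of the first class parameter.

    // Roughly `class Combine t u v` in Haskell terms; `impl Combine<U, V> for T`
    // then corresponds to `instance Combine T U V`.
    trait Combine<Other, Out> {
        fn combine(self, other: Other) -> Out;
    }

    // One "instance", relating three concrete types.
    impl Combine<f64, String> for i32 {
        fn combine(self, other: f64) -> String {
            format!("{} and {}", self, other)
        }
    }

    fn main() {
        let s = 3_i32.combine(0.5_f64);
        println!("{}", s); // "3 and 0.5"
    }

In practice the output position would more idiomatically be an associated type, which is where the remark about associated types (playing a role like functional dependencies or type families) comes in.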

The meet rule ensures

The meet rule ensures there's no ambiguity so I'm afraid I still don't see the danger. Again, a specific, even contrived example to illustrate the dangers would be helpful.

The meet rule doesn't rule

The meet rule doesn't rule out multiple matches (what I meant by ambiguity). It only ensures that when there are multiple matches, there is a most specific match (what you meant by ambiguity). But how do we know that picking the most specific match is appropriate?

Most Specific Match

Because the most specific match takes advantage of the most information in choosing the algorithm. You would only supply a more specific match if it was more efficient than the general case.

In the end your argument seems to be that the most specific match could do something other than provide a more efficient algorithm. This, however, is not a good objection; it is like saying a function "sort(x)" may not actually sort. There is nothing enforcing that the function name is an accurate description of what the function actually does. The "sort" function could delete the contents of the collection instead. If you accept that you have to trust the implementer of "sort" to actually provide a sorting algorithm, then it is no more dangerous to trust that someone providing a more specialised implementation of sort is in fact providing a more efficient algorithm that does the same thing as the general case.

Even if this mechanism was

Even if this mechanism was only ever used to supply optimizations (yeah... right), there would still be better ways of going about it.

Better ways

Other ways have been tried, and none of them have actually worked as well. Stepanov started by implementing Ada's generic container library, which is built on modules and generics; however, because of its problems, he abandoned this approach in favour of C++. If you have tried to write generic programs in both Ada and C++ using the STL, you will find that the C++ implicit approach is much better to program with. I am sure Stepanov can explain the problems with the modular approach much better than I can:

I had participated in several discussions early on at Bell Labs about designing templates and argued rather violently with Bjarne that he should make C++ templates as close to Ada generics as possible. I think that I argued so violently that he decided against that. I realized the importance of having template functions in C++ and not just template classes, as some people believed. I thought, however, that template functions should work like Ada generics, that is, that they should be explicitly instantiated. Bjarne did not listen to me and he designed a template function mechanism where templates are instantiated implicitly using an overloading mechanism. This particular technique became crucial for my work because I discovered that it allowed me to do many things that were not possible in Ada. I view this particular design by Bjarne as a marvelous piece of work and I'm very happy that he didn't follow my advice.
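Rust's generics inherit this implicit-instantiation style; a trivial example of my own:

    use std::fmt::Display;

    // A generic function: there is no explicit instantiation step.
    fn describe<T: Display>(x: T) -> String {
        format!("value: {}", x)
    }

    fn main() {
        // Each call instantiates `describe` implicitly for the argument's type,
        // and the matching Display implementation is selected implicitly too.
        println!("{}", describe(42));
        println!("{}", describe("forty-two"));
    }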

Strawman

I'm not proposing Ada. Suppose I accept that Stepanov tried a dozen approaches and found C++ to be the best. Are you seriously advancing that as an argument for the optimality of this approach?

Best approach I have found.

No, I don't think it's optimal. There may be some abstraction that is better that nobody has thought of yet. My suggestion would be to try to write the example generic code from Elements of Programming using the technique of your choice; if you can implement all the algorithms in the book, and the code is more elegant and understandable than the C++ in the book, then you should publish something about it, because I'm sure lots of people would find it valuable.

I am finding Rust provides a neat implementation of the algorithms, with more safety than the C++ versions, and the trait bounds, although more verbose, provide clear documentation of the dependencies of generic functions.

The most specific match has

The most specific match has more knowledge than all other matches, by definition, so how could it not be the most appropriate? If some existing behaviour is sufficient, then it will simply dispatch to that, otherwise it will override with its own implementation.

I don't see how this is much different than instance chains for resolving overlapping type class definitions, except it's a more uniform overloading model.

Overlapping instances are ad

Overlapping instances are ad hoc, too.

I'm confused, isn't ad-hoc

I'm confused, isn't ad-hoc overloading the point? I haven't even heard of a story for sufficient overloading that wasn't ad-hoc.

Last post for today

I thought I was making a useful point, but all evidence suggests otherwise. I'm going to recap where I'm coming from and take a break for today.

Overloading can mean a number of different things. Fully ad-hoc syntactic overloading should be minimally ambiguous to the user in a lexical scope (you want the user to be able to look at the code and tell which use is intended). Most of what we're talking about seems to me more meta-programming than overloading. Types are being used as meta-data and we write meta-programs that need to support extensible / open cases. The question at hand is how to organize these meta-programs. Jumping to "most specific" dispatch seems very ad hoc to me.

Meta Programs

Well type-classes in Haskell are a form of committed-choice logic programming, and Rust trait resolution has backtracking, so this is even more like programming in Prolog. So yes, you can view this as logic meta-programming, and in this light you can ask, "Is most specific match the best option?". Well first match seems a really bad option as you cannot easily provide new algorithms for specific cases, and random match seems even worse. Whichever way I look at it, most specific match seems to be exactly what you want.

I think it's a useful point,

I think it's a useful point, but I'm just having trouble grasping what you would consider a more principled/non-ad-hoc approach.

I agree multimethods enables some interesting metaprogramming-like programs, as do many other languages that lift previously second-class citizens to first-class status. Modular multimethods do this for pattern matching in my mind, which is why I can't quite grasp your objection since it doesn't seem fundamentally more objectionable than first-class cases.

Making anything first-class has its downsides too of course, but I'm having trouble understanding how "most specific match" would be one of those.

No ambiguity

Well, my language isn't inheritance based, but I don't want ambiguity or even overriding of methods. I was trying to understand the motivation for "most specific match". (This basically also answers Keean's post, so I won't respond there as well).

Can you explain how you see multimethods as making pattern matching first class?

My thoughts on multimethods

My thoughts on multimethods -> first-class patterns are a little half-baked, but I'll sketch out what I'm thinking.

Consider only the following changes to ML: an algebraic constructor is a subtype of its algebraic type, and all pattern matching is multimethod dispatch (as in this paper). Adding multimethods then seems equivalent to ridding yourself of the restriction that a pattern match must be fully defined all in one place, for good or ill.

Because this is ML, you'd still have first-class functions though. Reifying a multimethod as a value captures the set of types the multimethod is defined for at compile-time (perhaps via a union type or constraints). Now you can compose two such functions to cover any extended types, and it seems like you'd have the expressiveness of first-class cases.

The extensions above obviously aren't part of multimethods proper, but they seem to follow naturally if you want to integrate with ML semantics.
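Not multimethods, but here is a rough single-dispatch approximation in Rust of the "each constructor is its own type" idea, just to show how new cases can be added from outside a central match (the types and names are mine):

    // Each "constructor" is its own type; the trait stands in for the sum type,
    // and each impl plays the role of one arm of an open pattern match.
    trait Expr {
        fn eval(&self) -> i64;
    }

    struct Lit(i64);
    struct Neg(Box<dyn Expr>);

    impl Expr for Lit {
        fn eval(&self) -> i64 { self.0 }
    }
    impl Expr for Neg {
        fn eval(&self) -> i64 { -self.0.eval() }
    }

    // A later module can add a new "constructor" and its case without touching
    // the code above -- the open/extensible part of the analogy.
    struct Add(Box<dyn Expr>, Box<dyn Expr>);
    impl Expr for Add {
        fn eval(&self) -> i64 { self.0.eval() + self.1.eval() }
    }

    fn main() {
        let e = Add(Box::new(Lit(2)), Box::new(Neg(Box::new(Lit(3)))));
        println!("{}", e.eval()); // -1
    }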

Yeah

Yeah, that's in the direction of how I think things ought to work as well. I will point out that there is, to the best of my knowledge, no "most specific type" resolution in this approach. If multi-methods are implemented in a way that they're semantically just pattern matching, then I don't have that concern. The comment I made a couple of days ago about Open Recursion seems applicable here, too. It's nice to be able to treat them as first class values, but the linking support that you get in many languages with multi-methods is also nice and you might lose that with this plan if you're not careful.

Most specific instance

Answering the question: "Why is it a good idea to even be asking the question of which is the most specific instance?" If we have an iterator trait, bidirectional iterators probably extend forward iterators. We want to have trait-based dispatch, so that the optimal algorithm can be selected based on the type of iterator used. Multiple dispatch is necessary if we have functions that take more than one iterator, and those iterators may implement different traits.

Of course we would like people to be able to define new iterators and new generic algorithms, which may implement existing functions more optimally for the new iterators, without having to change the code in other people's modules where iterators and implementations of the generic function may be defined.
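Stable Rust can't express the multiple-dispatch version, but a single-dispatch sketch of the algorithm-selection idea (my own toy types) is a trait with a generic default that a more specific implementation overrides with a faster algorithm:

    // A toy cursor hierarchy: `advance` has a generic O(n) default, and a
    // random-access implementation overrides it with an O(1) jump.
    trait ForwardCursor {
        fn step(&mut self);
        fn advance(&mut self, n: usize) {
            for _ in 0..n {
                self.step(); // generic fallback: repeat single steps
            }
        }
    }

    struct ListCursor { pos: usize }
    impl ForwardCursor for ListCursor {
        fn step(&mut self) { self.pos += 1; }
        // uses the default O(n) advance
    }

    struct SliceCursor { pos: usize }
    impl ForwardCursor for SliceCursor {
        fn step(&mut self) { self.pos += 1; }
        fn advance(&mut self, n: usize) {
            self.pos += n; // random access: one jump instead of n steps
        }
    }

    fn skip<C: ForwardCursor>(cursor: &mut C, n: usize) {
        cursor.advance(n); // the most specific implementation available is used
    }

    fn main() {
        let (mut a, mut b) = (ListCursor { pos: 0 }, SliceCursor { pos: 0 });
        skip(&mut a, 1000);
        skip(&mut b, 1000);
        assert_eq!((a.pos, b.pos), (1000, 1000));
    }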

Regarding your concerns

Regarding your concerns about "most specific match", see the related paper by Hinze and Löh, Open Data Types and Open Functions. They discuss why most specific match is an appropriate best-fit strategy in the context of extensible/open pattern matching, which is closely related to multimethods.

Formalization 404

Unfortunately, the URL http://plrg.kaist.ac.kr/_media/research/software/ffmm_in_coq.tar.gz mentioned in the formalization paper is broken.