Multimethods over structural types

Hello,

I am interested to know if there are languages out there, or research projects, or people's wild private imaginings, that provide polymorphic multimethods over structural types.

I mean the non-parametric kind of polymorphism, where a method may have more than one implementation. The dispatch mechanism would choose the most specific of those available for a given invocation.

I have found one mention of something that seems in this area, but I don't understand the paper enough to know if it's what I hope for:

Combining Structural Subtyping and External Dispatch
Donna Malayeri & Jonathan Aldrich

I'm not _even_ a language design noob, and have only recently found that there is established terminology ("structural types") for what I've been edging towards for about 10 years.

But… I envisage that the dispatch mechanism would be able to examine the implementations of a method for the signature each requires of its arguments. Given a method invocation, it would be able to find the most specific implementation for the given types, or fail if there is no match or the match is ambiguous.

I also think that a kind of transitive closure is necessary here. I'll use an example to elucidate (the example is "single dispatch", btw):


def area(shape) # implementation 1
  return width(shape) * height(shape)
end

def area(shape) # implementation 2
  return 2 * pi * radius(shape)^2
end

def width(shape)
  return square_size(shape)
end

def height(shape)
  return square_size(shape)
end

I can now evaluate:

area( {width: 2, height: 3} )
-> 6
# This is just as you would suspect – implementation 1 is used.

area( {radius: 1} )
-> 6.28318
# here implementation 2 is used.

area( {square_size: 2} )
-> 4
# here, the record itself doesn't directly fit implementation 1, but because of the "width" and "height" methods it _can_ be used, so implementation 1 is chosen. This is the kind of transitive closure I mean.

Implementations' parameters are not decorated with explicit type declarations, because the types an implementation needs for its arguments _are_ the methods it calls on those arguments.
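For concreteness, here is a minimal Python sketch of this dispatch rule (all names here are hypothetical). Records are plain dicts, each implementation's required fields are declared explicitly rather than inferred from its body as envisaged above, and `derivations` encodes the transitive step from square_size to width and height.

```python
import math

# Transitive step: a missing field may be derived from other fields.
# square_size yields both width and height, as in the example above.
derivations = {
    "width":  ({"square_size"}, lambda r: r["square_size"]),
    "height": ({"square_size"}, lambda r: r["square_size"]),
}

def can_supply(record, field):
    """Can this record supply the field, directly or by derivation?"""
    if field in record:
        return True
    if field in derivations:
        needed, _ = derivations[field]
        return all(can_supply(record, f) for f in needed)
    return False

def get(record, field):
    """Fetch a field, deriving it if it is not directly present."""
    if field in record:
        return record[field]
    needed, derive = derivations[field]
    return derive({f: get(record, f) for f in needed})

# Each implementation declares the fields it requires of its argument.
area_impls = [
    ({"width", "height"}, lambda r: get(r, "width") * get(r, "height")),
    ({"radius"},          lambda r: 2 * math.pi * get(r, "radius") ** 2),
]

def area(record):
    """Dispatch to the unique implementation whose required fields the
    record can supply; fail on no match or on an ambiguous match."""
    matches = [impl for needed, impl in area_impls
               if all(can_supply(record, f) for f in needed)]
    if not matches:
        raise TypeError("no matching implementation")
    if len(matches) > 1:
        raise TypeError("ambiguous call")
    return matches[0](record)
```

With this, `area({"width": 2, "height": 3})` gives 6, `area({"radius": 1})` gives 6.28318…, and `area({"square_size": 2})` gives 4 via the derived width and height.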

So, if this kind of thing is going on all over the place and well known, I'd love to know. Or if it's been tried and found a terrible idea, that'd be good to know too :-) I find the possibility enormously exciting because I think it'd "fix" an awful lot of what I find rather hobbling about the languages I generally use. Especially if it could be embedded in some kind of live-coding environment.

Thanks,
Benjohn Barnes

Favoring Explicit

You might also look into Clojure's Protocols, and into architectural approaches such as component-entity-system (CES) or behavior-object-tag.

I favor structural types, and I've contemplated approaches to typeclasses and multi-methods for them. My conclusion was: don't build them into the language. The centralization of control and knowledge has advantages for adaptability and expressiveness. But it also creates problems for security and local reasoning. To precisely understand the impact of installing a multi-method in an open system, we should install it at a subsystem level (not a 'global' install, because 'global' is typically just a poorly defined, ad-hoc process-level subsystem). So I favor architectural approaches to the concerns that multi-methods are intended to address.

The approach you're suggesting is doable, but easily leads to problems: (a) selection of implementations can be non-deterministic and difficult to guess (e.g. try area on a 'rounded-rectangle' that has both height, width, and radius). (b) it can be difficult to resolve without executing, (c) if executing, may need some sort of rollback in case of side-effects on a failing path.

Thank you, I'll take a look

Thank you, I'll take a look at those suggestions.

I'd not thought about the security implications before – a good area for consideration, although I think it is something that could be understood better once such a system was available.

I think the library-or-language question is a greyer division, but there's no reason why what I've got in mind wouldn't work as a library, at least for a Lisp-like language, where a library can have access to program code. There are advantages to its being fairly baked in, though, so that ubiquitous access makes it the natural way to build a system. I think it's also likely to benefit from the compiler or interpreter having a very close understanding of it, for optimisation.

On your points:

a) There can be issues of ambiguity in choosing the most specific implementation, although my current thinking is that these can be flagged fairly easily during a compilation phase.

The specific example you give is a great clarification. I don't believe it to be a significant problem, because it's a case of cowboy.draw() versus artist.draw(). I saw this mentioned on LtU yesterday and thought it was a brilliant description of the issue :-) In this case specifically, the rounded rectangle would have a "corner-radius" member, not a "radius" member.

More generally, namespaces would almost certainly be necessary in any kind of reasonably large system to ensure that syntactically equal, but semantically divergent names can't be confused. Or equally, to ensure that semantically equal but syntactically different names can be seen as equivalent.

b) What I've got in mind allows the compiler to determine the possibility of an ambiguous call being made, but it could not know for certain that one will be. It also doesn't provide compile-time checking against a method being called with completely incompatible arguments. I have some optimism that this might be possible, but I've not really begun to think about it yet.

For me, static type checking is not a great requirement. IMHO few of the bugs I make in programs would be avoided by static type checking. Those I do make manifest very rapidly and I'd much rather see energy invested in better live coding and rich debugging / REPLs.

I find static type checking much more compelling when it provides an element of dispatch, as with Haskell's pattern deconstruction. I have a suspicion that there are other powerful advantages to static typing in facilitating abstraction, and that the whole static-error-checking thing is a bit of a distraction between dynamic and static proponents. Er, but I'm getting off topic :-)

c) I'm happy for a program to just halt at this point, or throw an exception.

Symmetric difference types?

Since you're thinking of static typing, this is how I imagine writing the static types for your example while forbidding ambiguous usage:

width : HasWidth a -> a
height : HasHeight a -> a

area : ((HasWidth Num AND HasHeight Num) -> Num)
  XOR ({radius: Num} -> Num)

where
HasWidth a = {width: a} XOR {square_size: a}
HasHeight a = {height: a} XOR {square_size: a}

Could someone who knows about union and intersection types comment on this? I just tried looking up "symmetric difference type" for XOR here, but I didn't find anything.
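One way to read these XOR signatures operationally is as runtime predicates requiring that exactly one alternative hold. A hypothetical Python sketch (the helper names are mine, not established terminology):

```python
def exactly_one(record, *alternatives):
    """XOR over field sets: exactly one alternative must be satisfied."""
    return sum(all(f in record for f in alt) for alt in alternatives) == 1

# HasWidth a  = {width: a}  XOR {square_size: a}
# HasHeight a = {height: a} XOR {square_size: a}
def has_width(record):
    return exactly_one(record, {"width"}, {"square_size"})

def has_height(record):
    return exactly_one(record, {"height"}, {"square_size"})

def area_arm(record):
    """The two arms of area's type, combined with XOR so that a record
    matching both arms (or neither) is rejected rather than dispatched."""
    rectangular = has_width(record) and has_height(record)
    circular = "radius" in record
    if rectangular == circular:  # both or neither: forbidden
        raise TypeError("ambiguous or unmatched record")
    return "rectangular" if rectangular else "circular"
```

Under this reading, the problematic rounded-rectangle record (width, height and radius) is rejected at the check rather than dispatched non-deterministically.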

Thanks, I'll look that up.

Thanks, I'll look that up.

Been there

The approach you're suggesting is doable, but easily leads to problems: (a) selection of implementations can be non-deterministic and difficult to guess (e.g. try area on a 'rounded-rectangle' that has both height, width, and radius). (b) it can be difficult to resolve without executing, (c) if executing, may need some sort of rollback in case of side-effects on a failing path.

A while back I put together function calls and backtracking semantics for a very similar effect, and it had these drawbacks. Fortunately, it looks like that's not what the OP has in mind. :)

Backtracking objection

I can better understand this now, I think:

One implementation approach would be to try to run all available methods and pick the best one at the end. An advantage of this is that it's probably achievable with a library in most languages: it avoids the need to parse implementations to understand their signatures.

I think the issues with it are too great, though. However, it does suggest a middle ground that would allow testing: implement it as a library without source access, and have the developer specify each implementation's signature by hand.
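For reference, a minimal sketch of that run-everything approach, with hypothetical names and Python standing in for the host language. Note its drawbacks: it presumes implementations are side-effect free, and it cannot perform the transitive square_size step unless derived accessors are registered as well.

```python
import math

def dispatch_by_trying(impls, record):
    """Run every implementation; a missing field (KeyError) rejects
    that candidate. Succeed only when exactly one candidate ran."""
    results = []
    for impl in impls:
        try:
            results.append(impl(record))
        except KeyError:
            continue  # this implementation's signature was not met
    if len(results) != 1:
        raise TypeError("%d implementations matched" % len(results))
    return results[0]

area_impls = [
    lambda r: r["width"] * r["height"],
    lambda r: 2 * math.pi * r["radius"] ** 2,
]
```

Here `dispatch_by_trying(area_impls, {"width": 2, "height": 3})` gives 6, but `{"square_size": 2}` fails, since nothing derives width and height for it.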

Or I could just get on with learning macros in Scheme or Scala properly.

CES & BOT

I've taken a look at CES & BOT, thank you. It's nice to catch up with Light Table again after seeing the KickStarter.

BOT's tags definitely have a flavour of what I'm after (although I want to recursively chain methods by virtue of actual arguments supporting the interface that those methods need – I'm not sure Chris Granger's tags do that, but they surely could). The separate specification of the components that data has, and of the implementation of the associated functionality, also seems related.

Thanks for the pointers.

Nimrod executes to resolve

Nimrod uses relatively classic multi-parameter dispatch, except that it dispatches not only on types but also on so-called "type classes", which are named type predicates/constraints defined by a contract that is checked by execution.

User-defined Type Classes (Nimrod manual)

type
  Container[T] = generic c
    c.len is ordinal
    items(c) is iterator
    for value in c:
      value is T

A type `Foo` satisfies `Container[Foo]` whenever, when `c` is replaced by a dummy value of type `Foo`, the code block above compiles and returns `true`. This means that the `is` checks (dynamic satisfaction of type or type-class constraints) must all succeed, the method lookups in the code block must succeed for `Foo`, and so on.

Note that no effort is made to recover from potential side-effects (except compilation failure) that would occur during such an execution -- or to even specify what would happen in such a situation.

Might want to look at Dylan generic functions.

The Dylan programming language has generic functions (see also c2 on GenericFunction and MultipleDispatch) similar to those in Common Lisp. An introduction at http://opendylan.org/documentation/intro-dylan/methods-generic-functions.html looks friendly to new readers.

Just for kicks, I worked briefly on a draft of a Dylan implementation in C++ in 1993, and the awkward part of dealing with generic functions (then) was not only finding the most specific match at dispatch, but also deciding where to dispatch when calling the next method, which is like calling a super method in Smalltalk. The Dylan spec deferred to Steele's writeup of multiple-argument type ordering in Common Lisp, but I don't recall the section after twenty years. It was interesting that Dylan's mid-90s spec said "go read the CL spec" for specific parts.

Edit: I should add a proviso that reflects John Cowan's point. Applying a pattern from multi-methods for nominal typing to a new context with structural typing is left as an exercise for the reader. (After reading c2 on NominativeAndStructuralTyping, think about implicit names based on structure; perhaps a cryptographic hash of canonical structure description is good for considering a series of questions.)

Dylan is nominally typed

Dylan, like Common Lisp, is nominally rather than structurally typed.

Prolog

Prolog would seem to fit the description:

width(Object, Width) :- member(width(Width), Object).
width(Object, Width) :- member(square_size(Width), Object).
height(Object, Height) :- member(height(Height), Object).
height(Object, Height) :- member(square_size(Height), Object).
area(Object, Area) :- width(Object, Width), height(Object, Height), Area is Width * Height.
area(Object, Area) :- member(radius(Radius), Object), Area is pi * Radius * Radius.

Hah! Yes.

Thank you for the thought!

I was familiar with DLV: "DLV is a deductive database system, based on disjunctive logic programming, which offers front-ends to several advanced KR formalisms" (whatever that means), although it's been a few years.

When I was using DLV I played about with some of the earlier ideas I had for this and they seemed to be well described with it.

:-) Do you pop up in other discussions of type systems to mention that their approach is trivially specified by first-order logic? ;-)

No, that's unfair, because going back to the DLV thing, I think it affected my thinking quite a lot (relational DBs too), so it would make sense if what I envisage has a lot of crossover with it. Definitely the image I have, of methods chaining outwards from primitive structural types so that they pick up a kind of cloud of methods about them, is very similar to Prolog's unification (is that the right term? – pun happenstantial).

I think Prolog as you show it here would also have problems when one implementation is a subtype specialisation of another, because Prolog would instantiate atoms for both the supertype's and the subtype's implementations. I imagine fairly simple extensions address that, though: they would need to explicitly track how specifically a given implementation's use of its formal parameters matches the actual arguments passed.

Probably the plan I have for picking up ambiguity and the option for adding implementations to resolve it would work okay in Prolog.

I'm pretty sure I know how multiple dispatch can work, and possibly also how dynamic variable bindings (as opposed to lexical ones) can fit into the dispatch mechanism. I wonder if they might get a bit unwieldy in Prolog… but I don't really know.

Anyway, yes, thank you for reacquainting me with a forgotten influence! That's quite encouraging, in fact.

First order logic might be right.

Haskell has the same problem with type classes (overlapping instances). You could take the same approach with Prolog that Haskell's overlapping-instances extension uses to get around this, and implement a meta-interpreter that takes the best match first and returns only the first match.

Of course, there is an argument that a subtype is an instance of its supertype, and that this sort of method overloading with different behaviours is bad (Liskov's substitution principle). First-order logic may be correct here. In that case a different method name, or resolving the overload ambiguity in the caller by grounding another term, might be a better approach, with a compile error when there is an ambiguous multi-method call.

Edit: this is supposed to be in reply to the comment "Hah! Yes" but I am getting weird submission errors at the moment.

Pure

Pure has a single Herbrand universe for all its terms, so it is effectively structurally typed, and its rules have to match on all terms, making them effectively multimethods. It doesn't do full unification like Prolog does.

Thanks

I'll take a look at Pure, thanks.

Tangent

My toy language Tangent was going to do this, but I ran into undecidability issues. In short, if your structural types can be structured on functions, and global functions do multiple dispatch, and global functions can satisfy the structural typing - you (well, I) run into cases where you can't determine if the function satisfies the structural requirements of the type without first knowing if the function satisfies the structural requirements of the type...

You had typed structure fields?

Could you talk about this a little further? I presume you were also (implicitly?) typing the fields of the structures, rather than the fields being dynamically typed?

I had explicitly, statically

I had explicitly, statically typed fields. It's hard to make a concrete example, but the core problem was that the type checker wanted to determine if B was a subtype of A.

So it checked if B satisfied A's structure. But A requires some function 'foo' -> A -> void, which B does not have. But there's a free function 'foo' -> C -> void. So we check to see if B satisfies C's structure, which would let it satisfy A's structure. And then it cascades from there, often into cycles/loops.

Implementation issue

Is this an implementation issue? What happens if you memoise the types you have proved already and just use the previous answer when it gets asked for again in the loop?

Structural subtyping should just be unification (ie pattern matching on types) and this is decidable providing you stick to first order types.
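One standard way out of such cycles can be sketched under simplifying assumptions (fields only, no free functions, so it sidesteps the wrinkle in this thread): carry a set of subtype queries currently in progress, and assume any pair already under consideration holds. This is the usual coinductive treatment of recursive structural types; all names below are hypothetical.

```python
def is_subtype(b, a, structures, assumed=frozenset()):
    """Does type b structurally provide everything type a requires?
    'structures' maps a type name to its {field: type} shape.
    A (b, a) pair already being checked is assumed to hold, which
    terminates the mutual-recursion cycles described above."""
    if (b, a) in assumed:
        return True  # cycle detected: assume, and let the outer check decide
    assumed = assumed | {(b, a)}
    provided = structures[b]
    for field, required in structures[a].items():
        if field not in provided:
            return False
        if provided[field] != required and not is_subtype(
                provided[field], required, structures, assumed):
            return False
    return True

# Two mutually recursive shapes: each turns out to subtype the other.
structures = {
    "A": {"next": "A", "x": "Int"},
    "B": {"next": "B", "x": "Int"},
    "Int": {},
}
```

With a plain recursive check, `is_subtype("B", "A", structures)` would loop forever on the `next` field; with the in-progress set it terminates and answers yes.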

If loops, perhaps unbounded recursion?

I'm going to guess that if it can happen with types referring to each other, it can probably be arranged to happen with an infinite sequence of different types, so that memoizing wouldn't help. (As you say, seems unlikely to be first-order.)

(My own experience with nontermination and types was when I was, typically for me, trying to make types first-class objects in an interpreter with no phase distinction. A type had two methods: one to determine whether a given object was an ancestor, and one to determine whether a given object was a member. I liked being honest about the computational nature of the type system; but the opportunities for bizarre behavior, both unintentional and deliberate, were appalling, and I eventually abandoned the project out of sheer fright.)

Unless I'm missing something

I could memoize the types once proven, but they've not been proven at that point. I do that for other things (inheritance cycles for example), but it ran into issues for this particular problem.

It very definitely could be an implementation issue, or something I'm blanking on due to my informal training. Real life showed up, so I haven't had much time to play around with the hobby project recently.

Are free functions part of a structure?

Something seems odd there. In structural sub-typing I thought that whether a type 'a' is a subtype of 'b' should depend only on the structures of 'a' and 'b'. Should a free function be considered part of the structure of 'a' or 'b'?

I would like it to be.

I would like it to be. Consider basic things like string + int (for append this number to the string). Without some semblance of free functions as part of the subtyping you can't be guaranteed that this operation is available for more general types.

Normally this would only be an annoyance (see Java and C# generics), but Tangent's phrase concept utilizes this sort of thing quite a bit - perhaps to its detriment.
