Overloading by return type without types

I wonder if there are any strategies for overloading by a type of a return value in dynamically types languages. There is one example I am aware about - Perl and its "contexts". May be there are some other insights on this topic?

Basically I mean run-time dispatch on the return values so that the code like so:

//This function is overloaded for int32
maxValue _ = 0x7fffffff

x = maxValue 0

can be translated into

//Overloaded (polymorphic) constant
maxValue = 0x7fffffff

x = maxValue ::: Int

without the need to statically type the program.

A hunch

My intuition tells me that it is possible in a completely dynamic way, but costly. Basically, what you need to do is to delay choice until it can be made, and this means keeping value in a "free structure" (in an algebraic sense) where you conserve as many generality as possible.

When you apply a "real" value on the overloaded value, you can use the type information of the "real" value to decide how to specialize the overloaded value; you "evaluate" the free structure at this point.

Haskellers have claimed in the past past that, for example, the following cannot be accomplished dynamically:


class Monoid m where
  neutral :: m
  mult :: m -> m -> m

foo :: Monoid m => m
foo = mult neutral neutral

instance Monoid Int where
  neutral = 0
  mult = (+)

instance Monoid [a] where
  neutral = []
  mult = (++)  -- list concatenation

bar = (2*foo :: Int)
baz = (1:foo :: [Int])

To transcribe this in a dynamic language, previous solutions have been to parametrize "foo" over a dictionary, or to use a different notion of overloading where "mult" is overloaded but "neutral" is not -- I've seen adaptations of monads to dynamic languages do that.

But you can also keep values in the "free monoid" (a piece of AST, really) until you acquire the dynamic typing knowledge you need. You would compile this into something like:


data DynData =
  | DInt Int
  | DList [DynData]
  | ... -- all the data packed in a unique data type, with dynamic checking
  | DMonoid FreeMonoid

data FreeMonoid =
  | Neutral
  | Mult FreeMonoid FreeMonoid

foo = DMonoid (Mult Neutral Neutral)

instance_FreeMonoid_Int Neutral = Int 0
instance_FreeMonoid_Int (Mult a b) = instance_... a + instance_... b

instance_FreeMonoid_List_a Neutral = DList []
instance_FreeMonoid_List_a (Mult a b) = instance_... a ++ instance_... 

mult_Int a b = Int (getInt a * getInt b)
  where getInt (Int n) = n
        getInt (FreeMonoid m) = instance_FreeMonoid_int m
        getInt v = dynamic_type_error TInt v


cons_List a b = DList (a : getList b)
  where getList (DList d) = d
        getList (FreeMonoid m) = instance_FreeMonoid_List_a m
        getList v = dynamic_type_error TList v

bar = mult_Int (Int 2) foo
baz = cons_List (Int 1) foo

This implementation is essentially equivalent to building closures/thunks that abstract on the dictionary, it only makes it more apparent that we are really manipulating data here. Abstracting on dictionaries is more general/homogeneous and it's rather how I would go if I wanted to built a prototype, to get an idea of the performance cost of this delaying in practice -- I expect not that much given the obliviousness of checks/boxing already happening.

By gasche at Thu, 2012-07-26 21:16 | login or register to post comments

In fact I do have a strategy

In fact I do have a strategy how to resolve this return type overloading for simple cases (and in fact your example does fall under this category). I can illustrate it using another example:

x `safeDiv` 0 = fail "Division by zero"
x `safeDiv` y = return (x `div` y)

res = 12 `safeDiv` 4 ::: Maybe

Here 'Maybe' is not exactly a type annotation but rather an explicit context (as in Perl). Type Maybe itself is a monad and implements fail/return (which are overloaded by return type). This is actually a classical exception handling pattern in Haskell - if you expect Maybe, Maybe is what you get, if you want Either, you will get Either. I can do this in a dynamic language as well. The trick is that run-time keeps track of this "Maybe" context. It can see that we expect a Maybe to be returned from "safeDiv". This context is propagated inside "safeDiv" function. This function has two "tail calls" - fail and return. As soon as these are *tail* calls, we can safely propagate the "Maybe" context inside them - and here is where the magic happens, fail/return require this context to be dispatched correctly and they get it. As a result, I can get away with just one "type annotation" (and without any "type annotations" inside safeDiv function).

But that this is a very trivial case. Here is another example:

foo x = return x >>= (\x <- ...)
foo y ::: Maybe

Here we annotate application of foo with Maybe, however, "return" function, that actually requires this Maybe context to be correctly dispatched, is not a tail call, a tail call is (>>=). And (>>=) is defined like so:

xs >>= x = join (fmap f xs)

And even here fmap is not a tail call, join is tail call. If we go further - they situation becomes even more complicated. And basically you would need a real type system to correctly resolve this situation.

If I understand your proposal correctly, it would have exactly the same problem in cases similar to this. And this is a "real life" example, pretty typical one.

By vorov2 at Fri, 2012-07-27 11:00 | login or register to post comments

The free structure arising

The free structure arising of your later example would be translated as:


data DynData =
  | ... 
  | DFreeMonad FreeMonad

data FreeMonad =
  | Return DynData
  | Bind DynData DynData

foo x = Bind (x, DClosure({x}, ...))

-- imagine you create a test `p` defined as `foo 1 == Maybe`:
p = case foo 1 of
    | DMaybe m -> m == Maybe
    | DFreeMonad m -> instance_FreeMonad_Maybe m
    | v -> dynamic_type_error TMaybe v

Your remark about tail-recursion can be seen as an optimization to make type information flow to certain part of the computation before they're reached by value, to avoid the creation then destruction of unnecessary free structures.

This can be done just as "easily" with an implementation abstracting over dictionaries:


data DynData =
  | ... 
  | DTypeClass (MonadName, Dictionary -> DynData)

data Dictionary =
  | Monoid { neutral :: DynData, mult :: DynData -> DynData -> DynData }
  | Monad { return :: DynData -> DynData, bind :: DynData -> DynData -> DynData }
  | ...

foo x = DTypeClass (\d -> bind d (return d) (...))

resolve_dict "Monad" TMaybe = (..., ....)
resolve_dict "Monoid" Int = (DInt 0, wrap2 DInt (+))
...

dynNothing = DMaybe Nothing

eqMaybe n m = case n, m of
  DMaybe m, DMaybe n -> DBool (m == n)
  DMaybe m, DTypeClass n -> DBool (m == specialize TMaybe n)
  DTypeClass m, DMaybe n -> DBool (specialize TMaybe m == n)
  DTypeClass (k1, f1), DTypeClass (k2, f2) ->
    if k1 == k2 then DTypeClass (k1, \d -> eqMaybe (f1 d) (f2 d))
    else DTypeClass (k1, \d1 -> DTypeClass (k2, \d2 ->
           eqMaybe (f1 d1, f2 d2))
  v, _ -> dynamic_type_error TMaybe v
where specialize ty (name, f) =
  case f (resolve_dict name ty) of
      DMaybe m -> m
      DTypeClass m -> specialze ty m
      v -> dynamic_type_error ty v
 
p = eqMaybe (foo 1) dynNothing

By gasche at Fri, 2012-07-27 14:27 | login or register to post comments

Sorry, I am either missing

Sorry, I am either missing something from your example or I didn't explain the problem correctly. I understand that with your approach we can delay a computation that requires dispatch by a return type. But as soon as this computation is delayed, it doesn't have a type yet (doesn't have a type tag) - and therefore it cannot be used for overload resolution of other functions.

And this is what we have in my example. We have an application:

bind (return x) ...

where bind is defined like:

join (fmap f xs)

xs here is our "delayed computation". However, function fmap comes from Functor and is itself overloaded by a second argument (xs). In order to dispatch fmap we *need* a type tag of xs, but this type tag is not available yet - we simply cannot calculate it at this moment, because we don't have any proper context, we don't know what is going to happen in join, it might just ignore its argument and yield something completely different.

Am I missing something?

By vorov2 at Fri, 2012-07-27 18:31 | login or register to post comments

It would be straightforward to extend the Perl solution

Basically, Perl does a limited amount of static typing and passes along the results (scalar, array, boolean, or void) to the callee. Instead, the compiler could do more static typing in the same way and pass the result along.

In the implementation of Lua, the caller passes to the callee the number of results it expects, and the callee makes sure exactly that number are returned. Smart callees can get access to this information and make use of it, just as in Perl.

By johnwcowan at Sat, 2012-07-28 01:16 | login or register to post comments

Overloading on calls and returns

Okay, first think carefully about the semantics you want, because there are two cases here. If you have two implementations of a generic function, both of which take the same type sequence in arguments, but one of which returns (say) a string and the other an integer, do you want the compiler to work out which one to call in a context where the integer will cause an error but the string won't, and vice versa? 'Cause I think that's a bit too implicit to be a good idea, it can easily become ambiguous, and besides it throws you back into runtime typechecking anytime you have a function that can return any of several types, or a return site that can accept the return of any of several types.

Or, given a function body which, even when given the same sequence of argument types, specifies the return of either an integer or a string, do you want the compiler to work out a way to return to the integer handling code in one case and the string handling code in the other case without generating a runtime type check? 'Cause that's something I know how to do, and I agree that it's a good idea.

If you're looking to do the first case -- overload a call by using a return type -- You can do that most of the time when your compiler can figure out a unique "expected" type at the call site (ie, one that won't cause an error to be thrown) and determine which version of the overloaded function has that return type (as well as the call types of the arguments at hand).

But the definitive information about return types in a dynamically typed language becomes available when the function returns, not when it is called, and this lends itself better to implementing the second case. Furthermore, taking advantage of it at that point is easier and allows typechecks to be skipped in more cases, even when a function can return any of many different types.

So what I think you want to do is have the function (when it knows what types it's returning) choose the address it returns to based on the value of the type/s it's returning. Remember that the code of a continuation can have multiple representations, just as can the code of a function. And only one of those code vectors (the one that handles "type unknown at compile time" cases) needs to have code for runtime type checks in it. Different return types just "resume" at different addresses.

In a compiler I wrote for a lisp dialect, I used overloading on calls and overloading on returns that both worked this same way (it was in fact the same code).

Given a function and sequence of known argument types, it calculated the hash value at compile time for that sequence. The hash value would be used to find the matching function entry point or basic block. The same thing was done for continuations and the sequence of return types to pass to that continuation (this dialect supported multiple return values); given a continuation and sequence of known return types, it would calculate a hash value at compile time for that sequence, and use that hash to find the matching entry point for the continuation.

That allowed function calls involving types that could be determined at compile time to be made directly, without doing type dispatch at runtime, and allowed returns to skip runtime return-value type checking and 'return dispatch' where their return types could be statically predicted given their argument types. In the few cases where typechecking of returns was still necessary, it was done in the function just before return.

The effect was synergistic because of the naturally nested expressions idiomatic in Lisps. The call types of the next function to call usually depend on the return types of the current function and the return types of the current expression usually depend on the call types it was given, so once a runtime type check detects a particular type signature in a call to a deeply nested expression, the types of the returns from that function, hence the calls (and returns) of the functions inside which it was nested, were usually determined.

So usually it would be able to dispatch from a runtime typecheck directly into statically-checked code vectors with no further runtime typechecks, skipping runtime type check and dispatch for potentially dozens of further calls and returns.

A happy side effect of this is that in most of the (many) code paths the thing would generate for a function, even explicit type checks specified by the source code could be skipped because the result for that particular code path would be predetermined.

One not-so-happy side effect of this is that you can easily wind up with literally hundreds of different entry/exit points that represent the "same" function and you have long compile times and large memory requirements generating and storing them all.

IIRC, the technique was initially developed by Manuel Serrano. I implemented it for the multi-argument/multi-return case after looking at his code for the STALIN scheme system.

I will be adding this feature to my current language 'when I get around to it', because performance-wise it's a big win. But in my current language, fast startup is a goal and there's no hard phase separation, so a slightly different (incremental during runtime) implementation is needed.

Ray

By Ray Dillinger at Mon, 2012-07-30 22:17 | login or register to post comments

I can probably explain with

I can probably explain with more detail, what I am trying to achieve. My language (http://elalang.net) provides a support for classes, which are somewhat similar to Haskell typeclasses. One can write the following class:

class FromInt a where
  fromInt _->a

Functions inside a class has a signature, which is used for dispatch only, not for type checking. "a" is a placeholder for a specific type, like a type parameter. A function "fromInt" here is "overloaded" by return type. Implementations for this function can be provided at any point of time, in different modules and are completely separated. For example, this is how you can implement it:

type ABC = A | B | C

instance FromInt ABC where
  fromInt 0 = A
  fromInt 1 = B
  fromInt 2 = C

Under the hood this is nothing more than just a run-time dispatch of type indexed functions. An ability to overload by return type is important, because a lot of things can be naturally presented this way - without the need to explicitly pass some dispatch tables, arguments, which values are not used, etc.

I believe that according to your classification I looking for the overloading of the first type. Moreover the return type of the function is statically available - there is of course certain "trust" here - as long as we have an instance of "FromInt" class for "ABC", we *trust* it to return "ABC".

Of course, this type of overloading can be done by explicitly annotating each application of such a function, but it basically kills the whole idea. So my intention was to provide a way for a caller to set an "expectation" of a return value, so that this expectation can be propagated inside the functions.

My current protype works like so (syntax is ad hoc):

succ a = a + fromInt 1

succ B -> ABC

You can explicitly set a context (either by putting done a type name or even a value from which the type tag will be extracted dynamically), and this context becomes a temporary global context that is used for dispatch. The context, set for "succ", is used for dispatch inside "succ" - so you can basically call this function with multiple contexts. (Contexts are also stacked, so you can always override an existing context).
And if somebody would try to apply a "succ" function without a context at all, it would raise a run-time error.

The code above can be rewritten without contexts like so:

succ dispatchValue a = a + fromInt dispathValue 1

succ A B

Here we assume that "fromInt" is defined like so: "a->_->_". Its first argument is used for dispatch only, its value is not used at all. Of course in this particular case you can write it shorter like so:

succ a = a + fromInt a 1

But this can be done with the context as well:

succ a = a + fromInt 1 -> (a)

Also the value is not always available and sometimes you have to explicitely pass it which results in a boilerplate code.

The problem is that I haven't seen a lot of examples of such overloading techniques in dynamic languages, and it is hard to understand, what are the consequences of such design decisions.

By vorov2 at Tue, 2012-07-31 07:51 | login or register to post comments

Expected return type

Well, if you're dispatching by expected return type, then under the hood, your caller has to know the expected return type. That means knowing what type your continuation (return point) can accept.

One consequence of this is that you can't call a function ignoring the return, the way you can in most languages, because if you ignore the return, then you can't infer the type information to fully constrain the dispatch.

Another consequence of this is that you can't return any value, the way you can in most dynamically typed languages; you're contractually obligated to return a value of a particular type.

Alternatively, you can allow these things, multi-thread your dispatch to all possible implementations of the function, and continue whichever thread doesn't raise a type error. But that's a bit eccentric and a questionable use of compute resources, at best.

Ray

By Ray Dillinger at Tue, 2012-07-31 08:47 | login or register to post comments

1. Unable to call such a

1. Unable to call such a function and ignore the return is acceptable. It will only be valid for functions that are overloaded by return type; moreover, as soon as this is a pure functional language ignore the return value is not a good thing anyways - currently compiler generates a warning in such a case.

2. Not sure that I completely understand this point. The whole idea of this overloading by return type is to be able to write functions that are obligated to return a value of a particular type. You can, of course, still write regular functions that can return whatever they can think of. This is, in fact, a pretty important programming technique, as it allows to imitate variadic functions, etc.
This limitation will only be for functions of specific type, and it shouldn't be a problem, I think.

When you call a function like "fromInt", I assume that:
- it is pointless to ignore its return (basically it is the same as not to call it at all)
- it is perfectly valid to expect something specific, e.g. "fromInt x -> ABC"

Running all available functions in separate thread is a bit extreme :) Imagine, that you might have a hundred of them. Also, as I mentioned in previous posts, an "expected type" is not always available locally - and you will have to defer an evaluation of... merely the whole program, and this will render a language completely lazy, which is not the goal.

I am more interested, how this can be aligned with the function purity. On the one hand I can "explain" these contexts as implicit function arguments. However, these arguments are not only passed implicitly, but sometimes can be specified implicitly as well, e.g.:

guard v true = return v
guard v false = default

guard 42 (some_condition) -> List

Here i specify a context for guard application explicitly, but this context then implicitly passed to either return or default. An ability of this "implicit context propagation" is pretty crucial - without it this whole thing would be quite useless. But I am a little bit unsettled with the idea that application like "return 42" would have different results depending on the context, set elsewhere.
It can be pure (even in an implementation, because this context can be literally passed to function as an argument), but it looks like that it is not.

By vorov2 at Tue, 2012-07-31 09:11 | login or register to post comments

Did you get what he puts in his hookah?

Personally, I found the class declaration a bit confusing, then somewhat got it with "want to dispatch on return type." Then thought I understood the instance declaration, but didn't see how he could dispatch on return type giving the mechanics. I completely lost it with the "succ" examples.

Vorov2, I really have no idea what you're doing except for that you "want to dispatch on return type" for which you, in a dynamic language, I guess, need to emit some kind of type promise a priory, before a function is evaluated?

By marco at Tue, 2012-07-31 09:16 | login or register to post comments

Yes, a promise - or, better

Yes, a promise - or, better to say, an expectation - have to be emitted a priory. For example, this is how this can be desugared:

succ a context = a + fromInt a context

succ some_value IntContext

In a class declaration a signature in the form "_->a" means:

[any type] -> [target type]

A "target type" is set by an instance declaration, so that whole thing is translated into:

[any type] -> ABC

Dispatch is done using the current available context (which is passed, explicitly or implicitly, by a caller).

By vorov2 at Tue, 2012-07-31 09:27 | login or register to post comments

+

fromInt, in a context ABC, returns either A, B, or C. I fail to see how you can add something to it?

By marco at Tue, 2012-07-31 10:14 | login or register to post comments

Why do I need to add

Why do I need to add something to it? I only need to set an ABC context - and that is enough for a dispatch.

By vorov2 at Tue, 2012-07-31 10:32 | login or register to post comments

Great

You have

instance FromInt ABC where
  fromInt 0 = A
  fromInt 1 = B
  fromInt 2 = C

So, fromInt takes an integer and returns A, B, or C.

Then

succ a = a + fromInt 1

Now here you add 'a' to 'fromInt 1', which returns either A,B, or C. Is '+' overloaded on all types, or is it the addition for integers? I.e., did you made a logical error there? If it's overloaded, or a typo, I understand

succ B -> ABC

as a request to evaluate 'succ B' in context ABC, and your example makes some sense to me.

(No offense. Looks to me that it'll work.)

By marco at Tue, 2012-07-31 10:41 | login or register to post comments

OK, I see. I guess my

OK, I see. I guess my example is misleading. Mainly because a "real" succ function doesn't work like that - it will be simply overloaded by its first argument for a target type.

Let's take a better, less misleading example:

open list

class Functor a where
  fmap _ -> a -> _ 

class Union a where
  join a->_

class Default a where
  default a
  
class Pointed a where
  return _->a

instance Default List where
  default = []

instance Pointed List where
  return x = [x]

instance Functor List where
  fmap = map

instance Union List where
  join = concat

xs >>= f = join (fmap f xs)

guard f x
  | f x  = return x
  | else = default

Sorry for such a big code listing, but at least this is a more or less real example. Now you can use this code to build monadic comprehensions like so:

comprehension n = 
  return n >>=
    (\x -> [1..10] >>=
    (\y -> guard odd y >>=
    (\_ -> return (x,y)
    )))

comprehension 1 -> List
/*
Produces:
[(1,1),(1,3),(1,5),(1,7),(1,9)]
*/

By vorov2 at Tue, 2012-07-31 11:02 | login or register to post comments

Cool

No, the long example is actually better. I get the gist of it. Stuff you'll probably wont be able to do is pass a function context, or dispatch on lists of integers and lists of booleans, or can you? It already looks quite usable to me.

By marco at Tue, 2012-07-31 11:21 | login or register to post comments

A function context is passed

A function context is passed implicitly - in other words, when you apply a function using a specific context, this context is automatically passed to all functions calls within this function. In my example you pass a context to a "comprehension" function, which in its turn pass it to "guard" function, and this context is used inside a "guard" function to dispath "return" and "default". That is in fact a thing that worries me a little bit, as soon as it renders functions to dependent not only from their arguments, but from some "implicit" context as well.

As for list of integers/list of booleans - you're right. Basically, there is no such thing as list of integers or booleans. There is just a list, it is always polytypic, e.g. [1,true,'c'] is completely valid. That is a benefit (and a downside at the same time) of a dynamic typing.

By vorov2 at Tue, 2012-07-31 11:31 | login or register to post comments

Lovely

Oh, I meant an arrow instance, like 'instance Arrow (->)', but forget about it. I read your language description. Really nice, a dynamic strict/lazy version of Haskell. Who would have guessed? Lovely!

By marco at Tue, 2012-07-31 12:12 | login or register to post comments

Thanks :)

By vorov2 at Tue, 2012-07-31 12:23 | login or register to post comments

eager dispatch vs lazy type negotiation

I'm not sure why Ray's post made me think of this, but it reminded me of type negotiation in compound document architectures, like OpenDoc in the mid 90's. (I worked on OpenDoc's structured storage, so I'm familiar with some of the nuances in semantics.)

You ask how to dispatch polymorphically on return type, but you might actually want something else to happen, if you expanded the scope of problem analysis a little.

If you think of call parameters binding (perhaps multi-method style) to something you can invoke, that in turn might be willing to return different flavors of value. The right flavor might depend on what the caller wants and what the callee can provide, and perhaps quality as well.

OpenDoc's approach was (basically) to return an ordered vector of lazy promises, so the caller can pick and choose which available flavor best matched content desired. Laziness causes multi-phase dispatch, so source and destination can negotiate, in a conversation something like this:

Caller: "Hey you, what do you have on tap?"

Callee: "Here's what I have, ordered by quality and fidelity, and sometimes whim."

Caller: "Yes! I've been dying for rich text markup. Give me all you have."

Callee: "Eeww. Okay buddy, if you insist. Hold on while I crank it out."

Externally, you see preference oriented data transformations, depending on what producers can create and what consumers want to receive. This might or might not be a better match for an application context in which you want return type based dispatch.

By Rys McCusker at Tue, 2012-07-31 02:28 | login or register to post comments

I am afraid that this

I am afraid that this approach will have the same problem, which was discussed here already.
Sometimes you might need this "deferred" return value (its type tag) to dispatch other functions, and this information - what you actually want to get as a result - might not be available locally at all.

You can take a look at my example with a "bind" function, which is somewhere above.

By vorov2 at Tue, 2012-07-31 07:54 | login or register to post comments

Thank you for posting that,

Thank you for posting that, I hadn't considered it. That's an interesting, flexible technique. It allows graceful degradation or conditional extension, in the sense that functions or return sites are allowed to have an ordered set of *preferences* for what type they want to recieve, and can work with other code allowing some variability in type representation.

That would allow you to write code that works with the current implementation of another part of the program, but is ready for (and will immediately start using when it becomes available) an extension you're getting ready to patch it with.

But in general, these things would have to be similarly-typed enough that the same set of values can be meaningfully represented in a choice of several different representations. So, in some way they are the same "type" (or in OO parlance, inherit via an "isa" relationship from a common parent type). So it could be used within a "number" type to determine int, float, rational, etc, or it could be used within a "text" type to determine sequence, array, unicode, ascii, etc. But extending it all the way to a base type containing disjoint sets of values would be useless.

Bear

By Ray Dillinger at Mon, 2012-08-06 19:36 | login or register to post comments

maybe useful in thinking about new problems

Flexibility can be either help or hindrance, depending on context. Solving a problem in a more general way that necessary often adds to over-engineering, ambiguity, and latent subtle bugs. So as a rule of thumb, extreme generality is a red flag demanding a lot of due diligence.

OpenDoc was trying to do something pretty strange, allowing multiple editor plugins to cooperatively manage different parts of one document arranged in an embedding relationship. An abstract api for interaction defines a protocol to communicate, but doesn't pin down what happens on the other side of an api call. So global static analysis and optimization was neither attempted nor expected.

If a single developer writes a piece of software and all parts are visible, there's less reason to go to extremes in flexibility, if something more direct and specific serves as well. Control knobs permitting variable behavior just turn into a verification and testing headache.

But if you want to sinter together efforts of many developers, loosely coupled, some kind of vague cooperation scheme is required as least common denominator to interact. Today folks use the web as that least common denominator. Problems in getting different web apps to cooperate are a bit similar in nature to problems in cooperating plugins. Browsers obsoleted OpenDoc's purpose.

Anyway, it's hard to get just the right amount of late-binding: as little as possible that fully solves a problem. Too much is rope to hang yourself. It seems the closer you get a to a user who wants to experiment, the more negotiation helps with connecting consumers and producers. (I'll stop because I'm starting to wander.)

By Rys McCusker at Tue, 2012-08-07 03:06 | login or register to post comments

Wandering can be good

Wandering can be good :)
Thanks for sharing. Indeed, the Wikipedia page on OpenDoc gives us strong historical clues that it was an early precursor, with offsprings which had quite a long run (probably too long already); namely IUnknown, IDispatch, IMoniker, IPersistStorage, et al.

Your "what do you have on tap", etc late binding metaphor depicts nicely a trend of ideas that, you're right, also had a number of other incarnations over/for the web itself, more or less successfully.

By Cyril at Tue, 2012-08-07 16:13 | login or register to post comments

PL refocus

Rehashing the past on this site needs a PL focus. My first comment was about a specific case of type negotiation to illustrate an idea which might work better than return type dispatch in some problems (or not). Then I tried to obliquely address a standard question which goes, "What are you trying to do?" The PL focus there might loosely be phrased, "What if you had to support dynamic runtime editor embedding in your programming language?"

In Smalltalk the answer is trivial, while in C++ it tends to be complex -- unnecessarily so to my taste. Maybe we can get a Ruby fan to chime in here: would that be hard or easy? :-) That's another language like Smalltalk, so it would take a lot of effort to get a result as complex as typical in C++ for dynamic dispatch to code linked at runtime. I'll pass on discussing C++ versions.

Even more fervently I don't want to talk about Taligent, except I can relate a short anecdote (without names) about Objective-C since this has PL focus relevance. I said something like, "Why don't we use Objective-C for anything? Dynamic dispatch is cool." The response surprised me, which resembled, "It's for losers; look at so-and-so who went off to NeXT, which will never go anywhere." Sounds face-palm worthy now, doesn't it?

A lesson for PL designers might go: if your language is really static, it will be quite irritating when something very dynamic really needs to occur. Thus a way to fit dynamic use cases into a PL model can improve fitness in more app contexts.

By Rys McCusker at Tue, 2012-08-07 20:20 | login or register to post comments

You're right, on the PL

You're right, on the PL refocus.

Back to the OP's concerns and questions, the best I can think of as a relevant source of inspiration to solve a problem likely not trivial, but still possibly in his language's design ballpark, is maybe to re-read about the (well-informed!) ideas found in Benjamin C. Pierce's Types Considered Harmful (just a rough intuition).

By Cyril at Wed, 2012-08-08 00:09 | login or register to post comments

eager dispatch vs lazy type negotiation

Overloading on return type may be useful in APIs where the caller and callee actively negotiate to find some common types both can handle. The OpenDoc case is like that. Microsoft OLE is like that. There, you don't have composition problems, because only one function at a time is being resolved.

The hard case for return type overloading is a composition of functions, such as

f(g(h))

where multiple versions of f and g are available. The problem is not designing a system that can make that decision. That's quite possible. It's that the programmer (and a later maintenance programmer) can't easily see what that decision was.

By John Nagle at Fri, 2012-08-10 17:06 | login or register to post comments

mechanism vs policy

I like the way you put your finger on basic problems with good intuition for central tensions. It's hard to design without a plan to address tensions. An over-arching plan I usually pursue sounds like this: as you go lower in a software stack, get progressively more narrow, precise, and deterministic. In other words, avoid fuzzy things at the bottom.

(The rest of my reply here may be off-topic in the sense it doesn't focus on return value dispatch. And it's not PL-centric either, which is worse on this site. But if you squint enough, part of the mechanism-vs-policy divide may apply to PL design if a language aims to support code organized this way explicitly.)

How does a programmer know what's happening? (That's rhetorical; I'm not really asking.) Explicit negotiation at high level is hard to miss, but implicit fuzzy type dispatch at low level is easy to miss, and bad things can happen when a coder doesn't notice something. Many bugs seem due to: "I didn't think of that." I'll resist going with an easy explicit-good/implicit-bad theme here, because that seems cheap and tangential. Composition with many layers is tough generally, because it's hard to see what happens several layers beneath your current focus, wherever that is. But if lower layers get simpler, with fewer degrees of freedom as you go down, it's far easier to reason about.

I asked "How do you know that?" as a child, then read about epistemology in my teens before going into information systems far too interested in models while deeply skeptical that specific models often tightly fit problems they address. I expect things to be broken, subtly, the more so as things stack and amplify any misfit. So I'm a critic of static ontologies, and generally detest the smell of tech starting with, "Well, now that we know the correct way to model the world ..." — no, you don't. (I don't really enjoy retreading subtly broken things.) I tell you this because personal mysticism about incompleteness in models must clearly bias my perspective.

Thus I favor approximating things, because I don't believe in exact fits. Since approximations stack poorly because they amplify error, you want fewer layers, to minimize drift. And at boundaries, it helps to re-sharpen contracts to something as specific as possible while still encompassing needs. Even better, if you can document exact nature of mis-match between a higher general layer and a lower specific layer, it provides a handle for reasoning so a coder proceeds informed. That I favor bottom-up coding is likely not surprising. It explains how I answer, "But what if there's a disconnect between top and bottom? Why not do exactly what the top needs?" Because there's always a disconnect. You just manage it; visibility helps here. If software tools don't help, you have a problem. Whether a hidden issue kills you depends on how critical it is.

After about ten years of going hard at C++ full time, I noticed I had developed a consistent strategy in code organization, which I used the next ten years without seeing a reason to stop pursuing by default. In short it sounds like this: policy upward, mechanism downward. (Oof, I just flashed on the green-side-up joke about laying sod.) When I split roles, I separate higher fuzzy policy decision from lower precise mechanism actions. Mechanisms on the bottom don't decide what to do, they just perform very precise tasks spelled out by contract. Policy on top chooses which mechanism to use in a context, given current circumstance, but doesn't tell a lower layer exactly what to do beyond setting options knobs, if any.

It appears everywhere in my code, at different granularities including function boundaries. If a function is too big and must be split in two, the caller does policy and the callee does mechanism. It gives a maintenance programmer fewer things to think about when task is localized: either decide what to do (and logic is about deciding and switching), or perform a precise task as directly as possible. But don't mix those two roles, unless it's trivial leaf code. Note this design style lends itself to explicit dependency injection, when lower mechanism layers let a caller pass in dependencies as arguments.

The same idea applied to classes just looks like normal object-oriented programming, which is organized so wider api calls narrower implementation, and it seems natural to put mechanism in specialized subclasses. Except when I do it, I don't care about domain modeling. I only care about splitting policy and mechanism into usefully divisible roles. Verb tenses for "to widen" and "to narrow" may appear in design discussions I have with other coders, because widening and narrowing is often what needs to happen, and everyone gets that. (I don't think I have ever heard a coworker use terms covariant or contravariant, and using them myself would make me look bad.)

You can probably distill all that down to one useful PL design question: how can a language support explicit division of policy and mechanism so a coder can see it and take it into account? (Except I used a lot of words so the exact meaning of that question is clear from sufficient context.)

Edit: I forgot to add a planned section on how I want high=dynamic/low=static as a divide in language design, or else in systems using more than one language. I want flexible dynamic switching on top, then deterministic (and efficient) static behavior on bottom. You can easily write that section for me, though. Or you can argue high level dynamic languages like Python already handle escape to C for leaf code local efficiency.

By Rys McCusker at Sun, 2012-08-12 23:50 | login or register to post comments

I have implemented support

I have implemented support for overloading by return type using context-like approach. This file contains an overview of this implementation:
http://elalang.net/docs/Article.aspx?p=whatsnew.htm

A more detailed explanation on (type)classes, polymorphic constants and contexts can be found here:
http://elalang.net/docs/Article.aspx?p=classes.htm

A new version of Ela is available here:
http://elalang.googlecode.com/files/ela-platform-2012-8.zip (Cross platform, but requires .NET or Mono)

By vorov2 at Sun, 2012-08-12 09:51 | login or register to post comments

Lambda the Ultimate

User login

Navigation