Namespaces for methods?

I'm working on adding namespaces to messages in my programming language. I've run into a few... strange... side-effects of it that I'd appreciate some feedback on.

The motivation here is that since classes are open to extension, namespaces will let you add methods to existing classes while avoiding collisions. For example:

// In document.mag:
class Document
    // Stuff...
end

// In render.mag:
def Document print()
    // Code to draw the document on-screen...
end

// In printer.mag:
def Document print()
    // Code to send the document to a printer...
end

// Later...
var document = Document new()
document print() // Which one?

Here, the two print methods collide, which is bad. To fix that, you can place the methods in namespaces. Dots are used to separate namespace prefixes (and are not used for method invocation, which just uses juxtaposition). We can disambiguate the above by doing:

// In render.mag:
def Document render.print()
    // Code to draw the document on-screen...
end

// In printer.mag:
def Document printer.print()
    // Code to send the document to a printer...
end

You can select a specific print method by either fully-qualifying it:

document render.print()

Or by importing the namespace:

using render
document print()

When the runtime encounters a message like print, it searches all of the namespaces imported with using until it finds a match. Right now, namespaces are searched in the reverse of the order in which they were imported, and the message dispatches to the first match. I may make it an error for there to be multiple matches.
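The lookup described above can be modelled roughly as follows. This is a Python sketch, not Magpie's actual implementation: the per-class table keyed by (namespace, name) and the helper names `METHODS`, `send`, and `send_qualified` are all assumptions made for illustration.

```python
# Hypothetical model of namespace-aware dispatch; not taken from Magpie itself.

class Document:
    pass

# Assumed layout: each class has a method table keyed by (namespace, short name).
METHODS = {
    Document: {
        ("render", "print"): lambda self: "drawn on screen",
        ("printer", "print"): lambda self: "sent to printer",
    }
}

def send(receiver, name, using=()):
    """Send an unqualified message, searching the last-imported namespace first."""
    table = METHODS[type(receiver)]
    for namespace in reversed(using):       # reverse of import order
        method = table.get((namespace, name))
        if method is not None:
            return method(receiver)         # first match wins
    raise AttributeError(f"no method {name!r} in namespaces {list(using)}")

def send_qualified(receiver, namespace, name):
    """Send a fully-qualified message such as render.print."""
    return METHODS[type(receiver)][(namespace, name)](receiver)

doc = Document()
print(send(doc, "print", using=("render",)))            # drawn on screen
print(send(doc, "print", using=("render", "printer")))  # sent to printer
print(send_qualified(doc, "render", "print"))           # drawn on screen
```

With both namespaces imported, the later `using` shadows the earlier one, which is the ordering dependence that making multiple matches an error would remove.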

So, here's where you come in (I hope). There are two consequences of this that feel wrong:

Dispatching a method is O(n) where n is the number of imported namespaces. Because the language is optionally-typed, there's no way to resolve the fully-qualified name at compile time. That means every dispatch will search the namespaces at runtime. I'm not too worried about performance, but it feels odd that we can't resolve the fully-qualified name once statically and stick to that.

The fully-qualified name that's resolved for a method may be different at runtime depending on the value of the receiver. This is the part that feels really strange. Consider:

class Foo
    def apple.method() "apple"
end

class Bar
    def banana.method() "banana"
end

def callMethod(obj)
    using apple
    using banana
    obj method()
end

callMethod(Foo new()) // "apple"
callMethod(Bar new()) // "banana"

At runtime, an unqualified name may resolve to a different namespace based on the method set of the receiver. That runs counter to my intuition of namespaces. I consider an unqualified name in a scope with an imported namespace to just be a shorthand for the one "real" fully-qualified name that it represents. This implementation doesn't work that way. Instead, an unqualified name with imported namespaces defines a set of possible names, any of which are potentially valid based on the receiver.

That's a valid semantic, but one that feels unusual to me.

So, questions:

  1. If a name is found in multiple namespaces, should it have a resolution order, or should that be an error? I'm leaning towards error, but I'm interested in hearing other opinions.
  2. Does it seem reasonable to have different objects resolve to different methods when given an unqualified name, or is that just a trap for unwary users?
  3. Any other pitfalls or landmines you can see that I'm not aware of?
  4. Any pointers to papers on related systems? I read the classbox paper which was interesting but didn't really go very far down the rabbithole. I'm not an academic so I don't know much about doing research to find papers related to a topic (a skill I'm rapidly needing to acquire). Pointers here would be super awesome.

Thanks!

Based on how it sounds like

Based on how it sounds like you think it should work, why not break up your protocol into two steps:

1. Resolve each unqualified symbol into a fully qualified symbol

2. Dispatch on fully qualified symbols

I'm unclear on the semantics of your definition 'def apple.method() ...'. What happens to the ambient namespace that Foo is in? Is it just discarded or appended? What would happen if you didn't have the explicit 'apple.' namespace modifier on method?

1. Resolve each unqualified

1. Resolve each unqualified symbol into a fully qualified symbol

2. Dispatch on fully qualified symbols

That's basically what it does. The weirdness is that step 1 may resolve to different fully-qualified symbols based on the value of the receiver at runtime.

I'm unclear on the semantics of your definition 'def apple.method() ...'. What happens to the ambient namespace that Foo is in?

Classes are first-class, so Foo is just an identifier for a variable. Like all messages, it can be in a namespace, but its namespace is unrelated to the namespaces that methods on an instance of Foo may use. Namespaces are just for names, so Foo's namespace is only relevant when you're dealing with the name Foo itself. (In other words, it only matters when you call a static method on Foo, like a constructor.) For example:

class blah.Foo
    def bloop.bar() print("bar")
end

def example1()
    var a = Foo new() // Error: can't find Foo
    var b = blah.Foo new() // OK: Fully qualified
end

def example2()
    using blah
    var foo = Foo new() // OK: Looked up in blah
end

def example3()
    using bloop
    var foo = blah.Foo new() // OK: Fully-qualified
    foo bar() // OK: Looked up in bloop.
              // Note that we don't need blah to find bar() 
end

What would happen if you didn't have the explicit 'apple.' namespace modifier on method?

It will default to being in the global namespace. There is also a namespace construct that will set the default namespace to use for all identifiers declared in it:

// These do the same thing:
class blah.Foo
    def blah.bar() ...
end

namespace blah
    class Foo
        def bar() ...
    end
end

There's no required correspondence between the namespace a class is in, and the namespace its members are in, by design, though many times I would expect they will be in the same namespace.

OK, your example answers my

OK, your example answers my questions and makes sense.

The weirdness is that step 1 may resolve to different fully-qualified symbols based on the value of the receiver at runtime.

Right, I meant to suggest that you look up the symbol name independently of the application. Have you considered having step 1 occur at compile time/load time? ie. scan all of the loaded code, build a complete set of symbols, and then bind to symbols early (bonus if you can distinguish definition from use at this time). You could still support more dynamic module load patterns, but might have to declare a few symbols in that case.
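A minimal sketch of that load-time step, in Python rather than Magpie. The `DEFINED` set, `ResolutionError`, and `resolve` are hypothetical names; the sketch assumes the loader can see every method definition before any call site is bound.

```python
# Hypothetical sketch: bind an unqualified name once, against every symbol
# known at load time, instead of against the receiver on each call.

# Assumed to be collected by scanning all loaded code for definitions.
DEFINED = {("apple", "method"), ("banana", "method"), ("bloop", "bar")}

class ResolutionError(Exception):
    pass

def resolve(name, using):
    """Return the single (namespace, name) pair that `name` abbreviates."""
    matches = [(ns, name) for ns in using if (ns, name) in DEFINED]
    if not matches:
        raise ResolutionError(f"{name!r} not found in {list(using)}")
    if len(matches) > 1:
        raise ResolutionError(f"{name!r} is ambiguous: {matches}")
    return matches[0]

print(resolve("bar", ["apple", "bloop"]))   # ('bloop', 'bar')

try:
    resolve("method", ["apple", "banana"])
except ResolutionError as err:
    print(err)                              # reports the ambiguity
```

Under this scheme the callMethod example from the original post would be rejected when it is loaded, because method is defined in both imported namespaces regardless of which receiver shows up later.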

Maybe I'm biased to more static things, though. Your scheme doesn't seem any worse than what's in use in e.g. python, at least as far as I can tell.

This is a consequence of

This is a consequence of treating methods as string names, which is prevalent in a lot of OOP languages. Essentially a method call foo bar() eventually gets down to something that indexes foo with the string "bar". In my opinion, this is the wrong thing to do: names should be irrelevant at runtime. It can be fun to take such an approach to the extreme though: you get something like Tcl.

An alternative is to treat methods as first-class values, as in CLOS for example. Then methods get treated as ordinary values by the module system and everything works out automatically.
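One reading of this suggestion, sketched in Python. `functools.singledispatch` stands in for CLOS generic functions here (it dispatches on one argument only, so it is not a full multimethod system), and the two `SimpleNamespace` objects are stand-ins for separate modules. All names are illustrative assumptions.

```python
from functools import singledispatch
from types import SimpleNamespace

class Document:
    pass

# Two distinct generic functions. Each is an ordinary value, so the fact that
# both are called "print" inside their own module never causes a clash.
@singledispatch
def _render_print(obj):
    raise TypeError(f"render.print is not defined for {type(obj).__name__}")

@singledispatch
def _printer_print(obj):
    raise TypeError(f"printer.print is not defined for {type(obj).__name__}")

@_render_print.register(Document)
def _(doc):
    return "drawn on screen"

@_printer_print.register(Document)
def _(doc):
    return "sent to printer"

# Stand-ins for modules render and printer (assumed, for illustration).
render = SimpleNamespace(print=_render_print)
printer = SimpleNamespace(print=_printer_print)

doc = Document()
print(render.print(doc))    # drawn on screen
print(printer.print(doc))   # sent to printer

# "Importing" is just binding a local name to the function value, so the
# choice of method is fixed lexically and does not depend on the receiver.
show = render.print
print(show(doc))            # drawn on screen
```

The call site holds a reference to one specific generic function, so no string lookup of "print" happens on the receiver at runtime.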

I'm considering CLOS-style

I'm considering CLOS-style multimethods but there are some issues I still need to work through. Either way, though, I don't see how that would help. If I had multimethods that were just specialized on the argument type, I'd still want namespaces for those methods so that you can have two methods with the same short name specialized on the exact same type and be able to disambiguate them.

But maybe I'm missing something here. What do you mean by "methods as first class values"?

Then methods get treated as ordinary values by the module system and everything works out automatically.

Ah, I think I see. That presupposes some other kind of module system. :) Magpie doesn't have anything like that yet, but my hope is that namespaces would cover half of that (and an import system would cover the other half).

It's not that Magpie has specific support for namespaces for classes and namespaces for methods. It doesn't. It has support for namespaces for messages and (like Io) method calls and top-level names like class identifiers are both just messages.

The only difference between Foo and bar foo() is that the Foo in the first example has an implied receiver and in the second one, foo is being explicitly sent to bar. From the name resolution perspective, they're the same.

Seems to be mixing two things

It seems to me that your "namespaces" are just longer unique aliases for the unqualified name. When the qualification is removed by the using clause only the unqualified name is in scope and so the name clash is back.

To me your approach feels funny because the usual method of qualifying method names is to use the class name as the namespace, whereas your method uses a separate orthogonal namespace. So now you have two methods of resolving the name, qualified by namespace or by object dispatch if unqualified.

So my 2c worth on your questions:

1. I'd make inability to resolve multiple names an error. Taking most recent is dependent on the program order which could be modified by modules and separate compilation/loading. Trap!
2. Unqualified names can't resolve by namespace, so either they resolve by object or they are an error per answer 1. Resolving by dispatching on the object is a reasonably common OO idiom so I'd say it's acceptable.
3. Because the namespaces are separate from the class names programmers can make mistakes like trying to call banana.method() on what at runtime turns out to be a class Foo object. There is a whole extra range of ways to make an error opened up by the separation of class and namespace.
4. Can't immediately suggest papers, might come back on that.

When the qualification is

When the qualification is removed by the using clause only the unqualified name is in scope and so the name clash is back.

In that case there won't be a clash, the names will just be unavailable. If you have an identifier named foo.bar the only way to resolve it is either by fully-qualifying it (foo.bar) or by explicitly importing foo and then doing bar.

So now you have two methods of resolving the name, qualified by namespace or by object dispatch if unqualified.

That's a good way to look at it. It's actually both resolution methods at the same time: object dispatch uses qualified names as well.

Taking most recent is dependent on the program order which could be modified by modules and separate compilation/loading. Trap!

It's only dependent on the order of using expressions in the surrounding scope, so it should be fairly easy to see, but I agree that having an error on collision is probably the safer bet.

Because the namespaces are separate from the class names programmers can make mistakes like trying to call banana.method() on what at runtime turns out to be a class Foo object.

That's actually valid if Foo happens to have a method whose fully-qualified name is banana.method. There's no requirement that namespaces (here banana) are the same as the name of the class that uses them for some methods.

In fact, the ability to have method names in different namespaces from the containing class is the real goal I'm going for here. That way a single class can mix in a number of different sets of methods in different namespaces without them colliding.

For example, let's say you have a simple List class. The class name may be in a collections namespace. The core methods on it like add() and remove() would probably also be in collections. To use a List in a for loop, it needs to provide an iterate() method. It might make sense to have that in a separate iterators namespace so that it's separated from the "core" list operations.

Objects also have a couple of reflection methods to do things like access their class and reflect on their fields. Those methods would be in a reflection namespace so that you don't trip over them all the time, but when you do want them, you can access them like:

using reflection
myObject getClass()
myObject hasField("blah")

instead of having to do something indirect and Java-like like:

// Here Java syntax:
Reflection.getClass(myObject);
Reflection.hasField(myObject, "blah");

My goal here is to get the nice syntax and polymorphism of regular method dispatch while avoiding name collision and objects turning into giant bags of undifferentiated methods.

From the example

the names will just be unavailable

So in your example function callMethod, how is the name "method" found? It's in both apple and banana, which are both imported by using. Is it now a purely dynamic binding on the class? If so, what did the namespace gain you?

That's actually valid if Foo happens to have a method whose fully-qualified name is banana.method. There's no requirement that namespaces (here banana) are the same as the name of the class that uses them for some methods.

Of course it's ok if the method is present, but in your example Foo has an apple.method(), not a banana.method(). So now there will be a run-time failure to find banana.method() depending on what object it is applied to. This can't be resolved at compile time unless the object type is known.

From your examples of collections, iterators, and reflections it seems that you propose an idiom of using the namespace to collect an interface. But of course idiomatic programming is risky due to accidental or deliberate misuse or misunderstanding by inexperienced (in your language) programmers. Have you thought of making interfaces explicit?

I do like the ability to separate methods with the same name but different semantics, though; that is one of my problems with duck typing. E.g., compare the likely semantics of tree.bark() and dog.bark().

So in your example function

So in your example function callMethod, how is the name "method" found? It's in both apple and banana, which are both imported by using. Is it now a purely dynamic binding on the class? If so, what did the namespace gain you?

It is dynamically bound. In this contrived example, it doesn't buy you much. A more (I think) realistic example would be:

class Foo
    def apple.method() "apple"
    def banana.method() "banana"
end

def callAppleMethod(obj)
    using apple
    obj method()
end

def callBananaMethod(obj)
    using banana
    obj method()
end

callAppleMethod(Foo new()) // "apple"
callBananaMethod(Foo new()) // "banana"

In this example, without namespaces there would be a collision in Foo when it tried to have two methods named method. With namespaces, it can have both and code that cares about one or the other can specify it by using the appropriate namespace.

This can't be resolved at compile time unless the object type is known.

Exactly. That's my big problem with this system. Magpie is dynamic and doesn't have a well-defined concept of compile time, so there's no easy place where I can statically disambiguate an unqualified name.

From your examples of collections, iterators, and reflections it seems that you propose an idiom of using the namespace to collect an interface.

Yes, that's a good way to look at it, although there's no requirement that it be a 1-1 mapping of namespace to interface.

Magpie does have interfaces too, but they serve a separate purpose: They're part of the type system so are used to statically detect errors.

I do like the ability to separate methods with the same name but different semantics, though; that is one of my problems with duck typing. E.g., compare the likely semantics of tree.bark() and dog.bark().

Yes, that's how I was looking at it too. Languages like Ruby and JavaScript allow extending core object types, which can be really expressive, but then discourage users from doing it because name collisions are so problematic. My goal here was to be able to solve the name collision problem so that extending core types can be a safe, usable idiom.

It's not clear if the "using

It's not clear if the "using apple" has to be within the definition of callAppleMethod() or if it can be in an enclosing scope. If the latter then it can be textually an arbitrary distance from the callAppleMethod() and that makes the choice confusing, so this definitely should be an error.

Even if the using must be in the immediate scope, the choice of first definition is unusual, and so likely to confuse. So unless there is a really persuasive use-case for it, I think it also should be an error. Making a language work differently from the majority without good reason is programmer-unfriendly design (as I've found out).

It's not clear if the "using

It's not clear if the "using apple" has to be within the definition of callAppleMethod() or if it can be in an enclosing scope.

It could be any enclosing scope. using expressions are lexically scoped like most other things that deal with names.

Even if the using must be in the immediate scope, the choice of first definition is unusual, and so likely to confuse. So unless there is a really persuasive use-case for it, I think it also should be an error.

I'm inclined to agree. This thread has been a nice sanity check and given me lots to think about.

2c

Does it seem reasonable to have different objects resolve to different methods when given an unqualified name, or is that just a trap for unwary users?

That's not reasonable. While messages from different modules/namespaces may have the same unqualified name part, they certainly don't have the same intended meaning.

You really should be able to statically determine the qualified name of each message.

I agree completely. What I'm

I agree completely.

What I'm stumped on is how to statically determine the qualified name of the message. I really really want to be able to do that, but I don't know how in an optionally-typed dynamic language. Any ideas?

Scopes are the issue

In my view, this hasn't anything to do with types. It's all about scoping, namely unqualified ("wildcard") importing of identifiers into an existing scope.

I don't have the time atm, but I'm also working on similarly extensible methods in my own PL, so I'll get back to this later.

In the meantime, I think Dylan's modules, Dave Moon's suggestions for Arc modules, and Chez Scheme's modules could be of interest.

Thanks, those links are

Thanks, those links are really helpful. The more I think about it, the more I realize this would get easier if Magpie was using CLOS-style multimethods to define methods instead of normal OOP-style defined-on-the-class single dispatch methods.

This has given me lots of food for thought.

Scoped Extensions

You really should be able to statically determine the qualified name of each message.

That won't work if munificent is aiming for generic programming. I.e. we extend Foo with a 'print()' method in order to pass it to a generic 'Printer' object. In this case, the Printer cannot be expected to know about Foo's namespaces.

As I understand it, the

As I understand it, the point is

  • to have method names that are pairs (module, local-name), so that methods from different modules with equal local-names don't conflict
  • and to be able to refer to methods from a module using only their local-names, by importing a module.

    we extend Foo with a 'print()' method in order to pass it to a generic 'Printer' object. In this case, the Printer cannot be expected to know about Foo's namespaces.

    [I have the hunch that we're talking about quite different things here...]

    In my understanding, 'print()' would actually be a pair, say (com.example, print). The Printer would either use that fully qualified name, or import com.example and use just the local-name print.

    All in all, this seems to me no different from any module/namespace-mgmt system that offers unqualified imports - "import * from module".

  • It depends on why you are

    It depends on why you are attempting to extend the class. The scenario I'm imagining involves some code of the form:

    def printer(o)
      o.print()

    Now, you might fully qualify that to:

    def printer(o)
      o.com::example::print()

    In this case, if we assume independent developers, Alice and Bob, extending class Foo to work with printer, they'll still end up colliding in the namespace - Alice and Bob will both use 'com::example::print'. Thus, we didn't really gain with respect to collisions.

    OTOH, if we just use any 'print', the calls would be ambiguous. And if we crawl the stack for context, things just get painfully difficult to grok.

    It isn't very clear to me the user stories for which munificent expects classes would be extended.

    Ah!

    In this case, if we assume independent developers, Alice and Bob, extending class Foo to work with printer, they'll still end up colliding in the namespace - Alice and Bob will both use 'com::example::print'. Thus, we didn't really gain with respect to collisions.

    Ah, now I understand. Reminds me of context-oriented programming, much heavier machinery.

    It isn't very clear to me the user stories for which munificent expects classes would be extended.

    A simple use case I can think of is: you have some data objects. You want to serialize them to different wire formats, so you define "xmloutput::serialize" and "jsonoutput::serialize" methods on them.

    As I understand it, munificent's desire is to be able to, in specific contexts, just use the unqualified name "serialize", and have it map deterministically to either xmloutput::serialize or jsonoutput::serialize, depending on the using clause in context. Just a convenient abbreviation, really. Which makes me wonder "Is it really worth it?" and brings us back to How important is language support for namespace management? ;)

    munificent's desire is to be

    munificent's desire is to be able to, in specific contexts, just use the unqualified name "serialize", and have it map deterministically to either xmloutput::serialize or jsonoutput::serialize, depending on the using clause in context.

    Yes, that's exactly right. I don't want there to be any polymorphism (which is what the current implementation does). It's just a textual shorthand. But I consider that a very important feature: giving people an ability to shorten names encourages them to make longer clearly unambiguous ones. My background is C++/C#/Java and I can't imagine living without namespaces.

    Long names

    There are some advantages to long names when it comes time to grok code unambiguously, search code, et cetera. My habit in C++ is to spell out the namespace every single time - std::string, std::vector, et cetera - or use a typedef. I occasionally use 'using', but I don't like doing so in header files (and I write a lot of templated code).

    I like Haskell's approach of matching module granularity to file granularity, though I'd not want to use Haskell's massive dotted paths in the code.

    I've entertained the possibility of a wikified language, where names are in the form of wiki identifiers (each wiki name exports just one value, possibly a record). Then I read Gilad Bracha's "ban on imports" [ltu node] and Objects as Modules in Newspeak, which are convincing arguments IMO.

    After spending a bunch of time imagining all these other possibilities, I shelved the subject as low priority. I'll get back to it in a year or so.

    Objects as Modules

    You might see Objects as Modules in Newspeak for some inspiration. Newspeak is also designed for optional, pluggable typing.

    A little bit of JIT or caching should go a long way to resolving names at runtime while retaining a high level of dynamism.

    Extensions via Interfaces

    The issue seems to be that you're adding methods to a class directly. A layer of indirection might help. Consider:

    * instead of passing an unknown class Foo to a Printer, you pass a Printable object. (A little type inference would determine whether Foo is a Printable.)
    * the Printable interface, in this case, has a default instantiation - call 'print()' on the target. Default instantiations give you a lot of flexibility.
    * You can overload Printable[Foo] to have some other meaning.

    interface Printable 
      print();
    
    default Printable[o]
      print() { o.print(); }
    
    instance Printable[Foo f]
      print() { (f.getX()).print(); 
                (",").print();
                (f.getY()).print();
              }
    

    If all your interfaces have a simple 'forwarding' default, then you can easily decide whether a class has a default interface (latent typing), in addition to overriding or extending a class by instantiating the interface specifically for that class (or a supertype).

    Further, you may have one interface call on another (as a richer default), and it should be easy to statically disambiguate which interface you are using. Interfaces provide different 'views' of the same object. For fuller extensions, the interfaces could also be associated with type-signatures, and instances might introduce extra state/attributes.

    Magpie does have interfaces,

    Magpie does have interfaces, but they're a facet of the type system. Since the language is optionally typed, they don't have any runtime effect. Even if they did work closer to how most languages handle interfaces, I'm not sure if they'd address this. What if I had a single class that wanted to implement both Printable and OtherInterfaceThatHasAPrintMethod? How would I disambiguate?

    C# answers this with explicit interface implementations (which are quite clever), so maybe that would be the answer here too.

    Explicit interface implementations

    That's what I was using in my example above... e.g.

    instance Printable[Foo o]
       print() { ... }
    
    instance OtherInterfaceThatHasAPrintMethod[Foo o]
       print() { ... }
    

    You would disambiguate by saying which interfaces you are using. Interface names could also serve as namespaces (Printable::print).
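The interface-instance idea above could be approximated like this in Python. This is only a sketch under assumptions: the `Interface` class, its `instance` decorator, the `Loggable` interface, and the `Label` class are invented for illustration and do not correspond to any feature of Magpie or C#.

```python
# Hypothetical sketch: an interface is a registry of per-class implementations
# with a default that forwards to the receiver's own method.

class Interface:
    def __init__(self, method_name):
        self.method_name = method_name
        self._instances = {}

    def instance(self, cls):
        """Decorator registering an implementation of this interface for cls."""
        def register(fn):
            self._instances[cls] = fn
            return fn
        return register

    def __call__(self, obj):
        for cls in type(obj).__mro__:           # most specific class first
            impl = self._instances.get(cls)
            if impl is not None:
                return impl(obj)
        # Default instantiation: forward to the object's own method.
        return getattr(obj, self.method_name)()

Printable = Interface("print")
Loggable = Interface("print")   # assumed second interface that also has print()

class Foo:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Label:
    def __init__(self, text):
        self.text = text

    def print(self):
        return self.text

@Printable.instance(Foo)
def _(f):
    return f"{f.x},{f.y}"

@Loggable.instance(Foo)
def _(f):
    return f"Foo(x={f.x}, y={f.y})"

foo = Foo(1, 2)
print(Printable(foo))           # 1,2
print(Loggable(foo))            # Foo(x=1, y=2)
print(Printable(Label("hi")))   # hi
```

The caller names the interface it wants, so two interfaces that both define print() on the same class never collide, and classes without an explicit instance fall back to their own print().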