Dependent Classes

Vaidas Gasiunas, Mira Mezini, Klaus Ostermann. Dependent Classes. OOPSLA'07.

Virtual classes allow nested classes to be refined in subclasses. In this way nested classes can be seen as dependent abstractions of the objects of the enclosing classes. Expressing dependency via nesting, however, has two limitations: Abstractions that depend on more than one object cannot be modeled and a class must know all classes that depend on its objects. This paper presents dependent classes, a generalization of virtual classes that expresses similar semantics by parameterization rather than by nesting. This increases expressivity of class variations as well as the flexibility of their modularization. Besides, dependent classes complement multimethods in scenarios where multi-dispatched abstractions rather than multi-dispatched methods are needed. They can also be used to express more precise signatures of multimethods and even extend their dispatch semantics. We present a formal semantics of dependent classes and a machine-checked type soundness proof in Isabelle/HOL, the first of this kind for a language with virtual classes and path-dependent types.

I enjoyed this talk at OOPSLA, although I was not able to see the end, and I enjoyed the paper even more. There's been so much work on virtual classes in recent years, and while I very clearly see a strong practical motivation for this work, I admit that I find it difficult to keep track of the technical trade-offs between different approaches. This, plus the persistent limitations mentioned in the abstract, lends some of the papers an unfortunately tedious feel (to me). I find this work refreshing, since it introduces a substantial new idea in this area. (And of course, one of the authors posts here regularly...)

By Matt Hellige at 2007-11-06 06:25 | OOP | other blogs | 11821 reads

And a question...

In case the authors (or anybody) care to comment...

In the related work, the following appears:

In the encoding, E has an additional constructor parameter, but since it has a singleton type (a path), there is only one possible value which can be passed. In fact, it would be easy to devise an extension of νcⁿ, where constructor arguments with singleton type can be initialized automatically.

I wonder whether this has implications (or possible applications) for access control and visibility? Is it possible that the name E is in scope while its required singleton parameter is not? Would the suggested extension change this property?

More generally, is anyone aware of work that uses singleton types in this way? I don't really have anything specific in mind, it's just something that occurred to me...

By Matt Hellige at Tue, 2007-11-06 06:35 | login or register to post comments

If we assume that we have a

If we assume that we have a Java-like visibility mechanism (public/private etc.), then I think the main implication is that types containing paths (and hence field accesses) need to be checked against the visibility annotations.

I believe that this would also be sufficient to deal with the automatic initialization for singleton types - if the path-type is valid w.r.t. access control, then the body of the class could also access the same object as an expression (rather than a type) and initialize the field manually without violating visibility constraints.

By Klaus Ostermann at Tue, 2007-11-06 09:22 | login or register to post comments

Makes sense

That makes sense, and of course the body of the class which depends on a path could also access the path as an expression. I was thinking more of the client, though. If we don't have automatic initialization, the client would have to be able to access the singleton value in order to construct the class (is "construct" the right word here?). So in addition to having the name of the class visible, the client would also need the singleton path visible as well.

That said, having thought about it a little bit more, I don't think it's particularly interesting. I guess I'm thinking of using the inhabitant of a singleton type as a sort of unforgeable capability, but the problem seems to be that, in order to track this statically, you can't ever pass the value into a context where the path wasn't already visible (or at least, you can't track it as a singleton in any such context), so I don't think you can really do anything interesting with it.

Anyway it was a half-baked idea... Thanks for your response!

By Matt Hellige at Wed, 2007-11-07 05:19 | login or register to post comments

Parameterized Modules & Type Classes

Just off the top of my head, this seems much like defining type classes within parameterized modules. The big difference -- and this seems to be the real, fundamental difference between OO and FP -- is access control.

By Jason Dusek at Wed, 2007-11-07 01:24 | login or register to post comments

I'm not sure I understand

I'm not sure I understand the access control problem you're referring to. Can you elaborate?

By naasking at Wed, 2007-11-07 03:02 | login or register to post comments

Keywords

Object oriented programming languages generally allow one to specify which other objects may access certain members -- as opposed to merely exporting them or not, as most module systems do.

When we consider that some OO systems -- for example, Smalltalk -- seek to provide an 'environment', we can see why permissions arise. There may be other reasons, too; but I can't say I've seen much use for anything more sophisticated than "don't use _foo things, they're private!".

It's true that some OO languages -- for example Python -- have no access control at all; but I think the award for most sophisticated access control must go to some OO language. C++ is likely at the head of the pack, with not only public, private and protected but also friends!

By Jason Dusek at Wed, 2007-11-07 04:35 | login or register to post comments

But it's a rather shallow

But it's a rather shallow feature. The most common way to leak privacy is to declare a member variable private and define an accessor function - the inevitable GetFoo, GetBar, ... functions. Many programmers use accessor declarations mostly for documentation purposes: it is nice to see which part of the interface is dedicated for public use. One can use name conventions for the same purposes.

By Kay Schluehr at Wed, 2007-11-07 08:22 | login or register to post comments

slow structs

But it's a rather shallow feature.

I am not arguing for its greatness -- just its existence. It's

The most common way to leak privacy...

Indeed. How many class definitions are merely slow structs?

By Jason Dusek at Thu, 2007-11-08 08:13 | login or register to post comments

slow structs?

Other than default access type and inheritance mode, there is no difference in c++ between "class" and "struct". Simply nothing that would impact performance.

By Scott Johnson at Fri, 2007-11-09 17:07 | login or register to post comments

Direct getters/setters

I think Jason is referring to the fact that some classes (in the OO sense, not the C++ sense) are nothing more than a bunch private fields with direct getters/setters for each of them. This makes them kind of like C-style structs, except that field accesses are though method calls (slower).

One advantage to JITed languages is optimization across abstraction boundaries. Direct getters/setters can be inlined and optimized away.

By Kannan Goundan at Mon, 2007-11-12 22:09 | login or register to post comments

Getting JITery

I don't think "JITed languages" makes any sense[1]. Nor is this something that can't be done with static compilation - functional programmers often expect the compiler to eliminate much of the cost of abstraction.

[1] Yeah yeah, single implementation languages and all that

By Philippa Cowderoy at Mon, 2007-11-12 22:16 | login or register to post comments

So what can't be done with static compilation?

Maybe a "JITed language" makes performance guarantees about optimizing away even some abstractions composed at runtime. (I'm thinking of static guarantees like Scheme promising to eliminate tail calls, or Active Libraries and Universal Languages)

By Brandon Moore at Tue, 2007-11-13 07:16 | login or register to post comments

The Other DLL Hell

The one that really blows it every time is dynamically loading code - you can't do a true whole-program analysis if you can't see the whole program.

By Philippa Cowderoy at Tue, 2007-11-13 11:55 | login or register to post comments

I've been thinking about

I've been thinking about this too. It seems like it should be possible to partially evaluate the program once the code is loaded and all symbols resolved. Does anyone know of any such work?

By naasking at Tue, 2007-11-13 16:12 | login or register to post comments

Encapsulation is the only

Encapsulation is the only access control you need (essentially, capability security). Access to private, protected, friend, etc. state can be achieved via shared encapsulated objects. I've always found access modifiers to be quite a pain, and their usefulness seems dubious when inheritance is eliminated; I'm not even sure that they are a software engineering win in that arena. Considering I'm not a fan of inheritance either, I suppose that would explain my dislike for any such complicated measures to make inheritance "safer".

By naasking at Wed, 2007-11-07 16:34 | login or register to post comments

There are other more

There are other more substantial differences. For example the difference between subtype (family) polymorphism and type class polymorphism. Subtype polymorphism is more flexible in that the type system does not need to keep track of the exact types of all objects. For example, you can build a hetereogeneous list of objects of different types, iterate over them, and call late-bound methods. You can have a little bit of that using existential types in Haskell, but the existential type mechanism breaks down once you use multi-parameter type classes.

By Klaus Ostermann at Wed, 2007-11-07 09:16 | login or register to post comments

Food For Thought

The famous fruit basket. In goes an apple, in goes a pear, and out comes -- a fruit!

There is something 'dynamic' about this. Can a static language have late bound methods?

By Jason Dusek at Thu, 2007-11-08 08:09 | login or register to post comments

Late binding

...is just poor man's first-class functions, and as such has nothing really to do with typing.

By Andreas Rossberg at Thu, 2007-11-08 12:43 | login or register to post comments

one derivation

One derivation of the encoding is in TAPL, if I recall correctly.

By lmeyerov at Thu, 2007-11-08 20:13 | login or register to post comments

Sometimes it's useful to have a basket of fruit

as opposed to a basket of just apples or pears.

Of course, to meaningfully have a basket of fruit, there are a couple concessions to consider:

* You need some way of downcasting--of determining whether or not a particular fruit is in fact an apple or a pear. This, as a general rule, means no type erasure. To encode such a thing in a fully static language, simply carry around a bundle of HOFs which represent the "type" of the fruit in question (to address the comment above that dynamic dispatch is a "poor mans higher-order function". That may be true in some sense, but it's quite useful syntactic sugar.

* It complicates, or eliminates, type inference. If an array's type is only initialized by its literals; the compiler can't infer if you want a fruit basket or a basket of only apples--if all you put in at first are apples. This is especially a problem with mutable containers, where the initial state may contain only apples for some reason, but you want a pear.

By Scott Johnson at Fri, 2007-11-09 17:16 | login or register to post comments

You can have a little bit of

You can have a little bit of that using existential types in Haskell, but the existential type mechanism breaks down once you use multi-parameter type classes.

I think that's a little unfair given that OO just doesn't have an equivalent of MPTCs. I'd also like to ask what you have in mind by 'breaks down' - you can't put two types together and pull out an instance of a 2 parameter class at runtime, but you shouldn't be expecting that and it's perfectly possible to quantify over more than one type and add the context as usual.

By Philippa Cowderoy at Thu, 2007-11-08 12:00 | login or register to post comments

Polymorphism

You can have a little bit of that using existential types in Haskell, but the existential type mechanism breaks down once you use multi-parameter type classes.

Can you give an example of something you can express with subtyping but principally not with universal+existential polymorphism? I'm not aware of any.

There certainly are many examples for the opposite direction, so I believe your claim about subtyping being more flexible than parametric polymorphism is not quite correct (assuming "flexible" = "expressive").

By Andreas Rossberg at Thu, 2007-11-08 12:51 | login or register to post comments

Example

You can most of the time express everything with everything else. The more interesting question in my opinion is how elaborate or compositional the encoding is.

If you want an interesting example, look at this code (it is from this paper):

public boolean intersectAny(Shape[] shapes) {
  for ( int i = 0; i < shapes.length; i++) {
    for ( int j = i+1; j < shapes.length; j++) { 
      if (shapes[ i ]. intersect (shapes[ j ])) { return true ; }
    } 
  }
  return false ;
}

The example is in a multi-dispatch OO language. That is the implementation of the "intersect" method that is choosen in each comparison depends on the dynamic types of both arguments, e.g., there is a special intersect method to compare circles with triangles. The example can easily be simulated with dependent classes (most of the typing machinery is not needed for it). Furthermore assume that you have a heterogenous list of different shapes, whose structure cannot be predicted by the type system. For example, imagine a GUI where the user can assemble the list while the program runs.

I'd be curious to see a good Haskell encoding of the example.

By Klaus Ostermann at Fri, 2007-11-09 10:58 | login or register to post comments

Access control is nothing new

By the time you've specced out an inheritance mechanism (super's 'protected', right?) the majority of OO access control mechanisms can be encoded with let and records, as in:

let {-- all fields
;
} in Record {-- all public fields
;
}

Protected's tricky to get right because we let anything that claims to inherit from a class access protected members, but if we're allowed a mutable class registry then you just add a slightly bigger record to it in the same let block and pass that into anything that inherits from the class.

Even C++'s friend mechanism can be achieved with sufficient nesting and a little tupling to get everything overlapping right.

By Philippa Cowderoy at Thu, 2007-11-08 12:22 | login or register to post comments

C++ template classes

C++ template classes seem to be quite similar to that feature and they've been around for more than ten years now.
So I don't see how this is anything new.

By Mathias Gaunard at Sun, 2007-11-25 05:01 | login or register to post comments

Even without reading the

Even without reading the paper I can tell you that the formal results are significant. I imagine others can chip in with other relevant details.

By Philippa Cowderoy at Sun, 2007-11-25 05:26 | login or register to post comments

Lambda the Ultimate

User login

Navigation