How do Java generics correspond to System F-(omega)?

In his recent guest lecture in my undergrad PL course, Markus Mottl mentioned that ML-style module systems help his company develop and maintain software in the large. This guest lecture made me wonder how an ML-style module system could be expressed using Java generics. Because I've already translated an ML-style module system into System F-omega, could someone please point me at a comparison or translation between Java (generics) and System F(-omega)?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Probably not what you're looking for...

...but NeelK gave an explanation of ML Functors in comparison to pre-generics Java. Maybe NeelK can enlighten us on whether the generics give new capabilities wrt to limiting the combinations?

Java's generics are loosely

Java's generics are loosely based on F-sub -- System F with bounded quantification. I say loosely, because typechecking full F-sub is undecidable and I don't know the precise restrictions the Java designers put in place to recover decidable typechecking.

You can model bounded quantification by introducing power kinds. If A : type, then Power(A) : kind, and is used to represent the kind inhabited by every subtype of A. Then the bounded quantification forall A < B. phi(A) can be represented using forall A:Power(B). phi(A).

There's an old paper by Luca Cardelli about this, and also Chapter 31 of TAPL discusses F-omega-sub; maybe you can cut it down to see if you can get what you need.

Loosely, indeed -- but sure.

Some differences I can think of offhand are (1) Subtyping rules in F-sub
are structural, whereas all subtype relationships in Java must be declared, and (2) Java allows F-bounded quantification, e.g.

interface Comparable<T> { int compareTo( T x ); }
class BinarySearchTree<E extends Comparable<E>> { 
     ...
     ... e1.compareTo(e2) ...
     ...
}

which I'm pretty sure F-sub does not. (I'm not sure what happens under the encoding of bounded quantification using intersection types.)

On the other hand, Java places some weird restrictions on where instances of generic types can occur, because implementation by erasure makes certain necessary run-time checks impossible -- in particular, one cannot have an array of them (List<T>[]).

Incidentally, Java's "wildcard types" allow for a twisted kind of existential quantification: List<? extends C> means exists a < C . List<a>.

So it's accurate to say that it's loosely based on F-sub, but they're far enough apart that metatheoretic differences (like decidability) should not surprise us.

JCV

I ignored wildcards because

I ignored wildcards because they're weird enough that I tend to pretend they don't exist, but I genuinely didn't know that Java allowed F-bounded quantification! Thanks for telling me that.

Weird restrictions are due to arrays

On the other hand, Java places some weird restrictions on where instances of generic types can occur

They pretty much have to do with keep Java's slightly broken array type system from leaking into the well-sealed generic type system. Aside from restrictions on arrays, I don't know of any other "weird constraints."

Many of the complaints about Java's generics are actually complaints about its array system, which is indeed broken! This is one of several reasons why discerning Java programmers avoid arrays in most cases.

I'm not sure about that...

They pretty much have to do with keep Java's slightly broken array type system from leaking into the well-sealed generic type system. Aside from restrictions on arrays, I don't know of any other "weird constraints."

I have two minor arguments with this. First, while Java's array type system may be broken, I think it's broken in a way that's symptomatic of some general issues with generics: no proper accounting for co/contra-variance. This shows up in non-array contexts in the fact, for example, that A being a sub-type of B does not imply that List<A> is a sub-type of List<B>. A more sophisticated account of variance could probably allow this as well as dealing properly with co- and contra-variant array access (reads and writes), or at least this was how it seemed to me last time I looked seriously at Java generics, which was quite awhile ago now.

My second complaint is that the main "weird constraint" on generics usually cited has to with erasure and its implications, particularly the inability to construct new instances of generic types. I can't see that this has anything to with arrays...

It isn't and it shouldn't.

A being a sub-type of B does not imply that List<A> is a sub-type of List<B>, because a List<A> is not capable of storing B's - a valid operation for List<B>'s.

Arrays work differently and, accordingly, are not type-safe: every store can cause an exception.

I'm still (easily) confused

What is the angle from which one claims List<A> should not be capable of storing List<B>? Or do you think it should be able to but Java fails in that regard? Is there some nice theoretical underpinning somewhere?

(It just seems to me that one could argue either way, from a "common sense" or simple un-PLT-nuanced in-the-trenches programmer point of view.)

Consider a language with a

Consider a language with a Top type - Java's not far off having one, though Object's not quite the same. If we allow List<A> to store B, where A <: B, then there could be absolutely anything in that list because anything can be considered an instance of Top and thus stored in it. At which point, you may as well just have a heterogenous List class.

Transitivities of evil.

OK, right, that makes sense enough to me, I think.

[Presumably this is old news and wouldn't interest many folks who have been there, understood that.]

(I'd like to redefine the problem a little bit so I'm less confused - I think A and B should be swapped around so that A is the root and B is the subclass, that will make things easier for my mind. :-)

This brings up something basic and yet still interesting/confusing to me at times: What do we all mean by subtyping and what do our languages mean? I know, I know, there's stuff like LSP (OK, that's a really slow anchor to load). To me, when you subclass something you cannot be removing functionality. That would be breaking any sane kind of contract. (Unfortunately, the Java collections stuff does that all over the place and just throws runtime exceptions, freaking genius.) So if B <:A I would expect that List<B> wouldn't store As because As are insufficient.

However, if C <: B maybe it would be OK to store C in List<B> as long as you could only access B's interfaces on the entries.

But this is how it works in

But that is how it works in Java:

  • a List<Number>, for instance, can hold Integer's, Double's, etc - any Number;
  • a List<Integer>, on the other hand, can't take just any Number - only Integer's.

This means that neither is List<Number> a sub-type of List<Integer> (as it may have Double's in it), nor is List<Integer> is a sub-type of List<Number> (since you can't store Double's in it).

Clearer?

Clearer?!

Never clearer to me, I'm easily re-befuddled.

(Also, the notation is killing me when it comes to entering posts, so I'm switching to suqiggly brackets instead of gt lt glyphs.)

neither is List{Number} a sub-type of List{Integer} - that's OK because I think Integer <: Number. Having Doubles in List{Number} it is fine; you should only be able to access the Number API when you get something out of the list, short of doing evil type casting. Also, I would never expect Double <: Integer or Integer <: Double because the values they can take on are so different. At best you can say they share some operations, but once you get into the values it gets too weird. When you say that List{Integer} can't take any Number that's fine. But it should be able to take Integer and any subclass thereof.

Summary: if C <: B <: A, then List{A} can contain A, B, or C but only with the API for A; List{B} can contain B or C but only with the API for B, and List{C} can only contain C. To me, that is one core thing polymorphism is all about. (If you define the subtyping relationship but the relationship is broken vis a vie Integer and Double then all bets are off.)

I'm not sure how that works back to the original postings, I'll go back and re-read :-)

I agree...

When you say that List<Integer> can't take any Number that's fine. But it should be able to take Integer and any subclass thereof.

Just keep in mind I was talking about Java, where Integer is final.

A List<Integer> can only take Integer's simply because there can be no sub-classes of Integer.

No, no, no, no, no.

Suppose we take your suggestion, and allow List<Integer> to be a subtype of List<Number>. Then this will compile, but will encounter a runtime type error (which Java's type system promises it shouldn't, because there is no explicit type cast):

void addFloat(List<Number> list)
    { list.add(37.62); } // OK, because 37.62 is a Float, which is a subtype of Number

...
List<Integer> ints = new ArrayList<Integer>();
addFloat(ints); // according to you, this should compile!
Integer first = ints.get(0); // type mismatch

The problem is for subtypes of List, add() needs to be covariant and get() needs to be contravariant. You can't have both. Java's generics are correct, and you are not.

Java's arrays are, in fact, broken in more or less the way described above.

Covariant, contraviant, blah

Sorry, I had that backward -- parameter types need to be contravariant (as with add()) and return types (as with get()) need to be covariant for subtyping to work. But the point still stands.

Re: list.add(37.62)

We might be missing each other's point. There is no "add" method on Number. Citing myself:

Summary: if C <: B <: A, then List{A} can contain A, B, or C but only with the API for A; List{B} can contain B or C but only with the API for B, and List{C} can only contain C. To me, that is one core thing polymorphism is all about. (If you define the subtyping relationship but the relationship is broken vis a vie Integer and Double then all bets are off.)

Some quickly googled slides

This is a short set of slides (in PDF format) about subtyping. The interesting sections are the ones on function and reference variance.

Thanks

Ever more for me to read and grok, but obviously this is pretty fundamental and hopefully simple-ish stuff - although when I start reading about these things I find plenty of interesting twists...

Is a Cow an Animal?

You might also like this brief discussion of Is a Cow an Animal?.

Funny you should mention that

Since I've been reading and re-reading (hoping that it will all magically sink in at some point) the oft-mentioned On the (un)reality of virtual types which includes the animal/food example, but claims to separate the concerns such that you don't have a problem any more. I think.

(Addendum: I feel ever again like I should learn and love O'Caml, although I've been frustrated with it in the past.)

Don't Hesitate

raould: Addendum: I feel ever again like I should learn and love O'Caml, although I've been frustrated with it in the past.

Me too! I'm still learning this language that I claim to love. Please don't hesitate to ask questions. If you don't think they're appropriate for the group, I can provide my e-mail address.

What does sub-type do for you?

I think a question begged is: what is the point of sub-typing? You talk in particular about List{Subclass} being able to store instances of Superclass. But that doesn't seem like the correct issue to me. The issue would be: if you can use Subclass anywhere that you can use Superclass, then you should be able to use List{Subclass} anywhere that you can use List{Superclass}.

Now, I don't know what Java does, but it would make sense to me to let people pass List{Subclass} into a method that takes List{Superclass} (modulo evil type casting that somebody might then do inside the method, but obviously that is their own dumb fault).

But it doesn't quite work if

But it doesn't quite work if the list's mutable, because the method might want to put a Superclass into that list. And you can't pass in a List{Superclass} when a List{Subclass} is expected because the code won't be expecting to pull a Superclass out of the list.

Ah, right you are!

Dogged. So that's another aspect of looking at these things: consuming vs. producing. Are there systems that are very different in their typing because they do/not allow mutability, or that vary the typing based on consuming vs. producing?

Sort of. The easiest thing

Sort of. The easiest thing is to start off with the concept of "variance" (roughly speaking, "can I accept a subtype in place of a type? Can I accept a supertype in place of a type?"), assign appropriate variances to mutable values and immutable ones and propagate everything according to some simple rules from there on - an ML-style system where references to mutable cells exist but all actual values are immutable works well.

Mutability...

As was already said, this breaks with mutability.

Going back to Java, I'd like to add that, when you don't need mutability, you can pass your List<Subclass> to a method that takes a List<Superclass> by using the Collections.unmodifiableList() method to wrap your list inside an unmodifiable list of the apropriate super-type (any attempt to mutate, will originate a run-time error, not compile-time - maybe someday).

What wildcards are for?

(any attempt to mutate, will originate a run-time error, not compile-time - maybe someday).

I don't understand this comment. Java's generics supports variance. It supports them via a "wildcard" syntax.

For your example, rather than typing the method such that it receives List<SomeClass>, you type it so that it receives List<? extends SomeClass>. The type-safety is enforced at compile time.

Angelika Langer's FAQ is a good place to go, to get a solid understanding of Java's generics facilities.

Java's generics supports

Java's generics supports variance. It supports them via a "wildcard" syntax.

Yes, I know. And if you look at the type signature of the Collections.unmodifiableList() method that I suggested using, you'll see it does use wildcards.

For your example, rather than typing the method such that it receives List<SomeClass>, you type it so that it receives List<? extends SomeClass>.

Assume for a moment I cannot (or do not want to, for whatever reason) change the method's typing - after all we are not always in control of all the APIs we are using. What are my other options?

Of course, if you are in control, then you should use variance whenever it makes sense (unfortunately that will not, I imagine, be the case for most code I'll see).

The type-safety is enforced at compile time.

Not if you're forced into using the method I suggested, because, as I also stated, attempts to modify the resulting list will throw an UnsupportedOperationException. This can certainly be seen as a "type error", since it originates from trying to modify an "UnmodifiableList", for which we could (and maybe should) have a separate type.