Tim Bray: Sun & Dynamic Java

Tim Bray

So on Tuesday we held a summit here at Sun, with a few of our internal Java leaders, and on the dynamic-languages side, Larry Wall and Dan Sugalski (Perl and Parrot), Guido van Rossum, Samuele Pedroni and Sean McGrath (Python), and James Strachan (Groovy). It was an educational day for us; herewith some take-aways and pictures.

Everyone and his sister is linking to this blog post, so you might have seen it already.

For a touch of flavor, here’s just part of the list of things to discuss that Larry had prepared: anonymous code/closures, lvalue methods/functions, variadic call/return, constant/rw/copy/ref/lazy parameters, wrappers/AOP, eval, exception handling/undef/nil/unthrown exceptions, exception handlers with lexical access, temporization/hypotheticality, efficient regex with complete semantics, access to dynamic context/want/caller, redispatch of methods (fallback upon "fail"), efficient switch statement?, versioned modules/classes, virtual classnames, continuations.

Feel free to express your opinion regarding this list, and other relevant factors.


what is this "recursion" you speak of?

There wasn’t a mob howling for tail recursion...

Grrrr....

FP?

No one from the FP community there, eh?

Expertise?

Dave Herman:

There wasn't a mob howling for tail recursion...
Grrr....

Dave is, of course, right; the translation is that there weren't enough functional programmers present.

Really, to me, the question isn't whether there are lots of other languages, dynamic or otherwise, that run in the Java ecosystem. The question is whether lots of other paradigms are available besides the object-oriented and imperative ones. The report makes clear that there were lots of OO and imperative guys there. That's a nice start, but it doesn't go nearly far enough in ensuring that the Java ecosystem evolves in constructive ways.

java.lang.StackOverflowError

I like Java, I really do, but they can do better than this.

Why do people think recursion is just an FP thing? The first recursive program I ever wrote was a binary search tree in C++. I didn't think data design was limited to academic navel-gazers. Doesn't OOP espouse, you know, code that follows the structure of the data?

Recursion and FP

Dave Herman: Why do people think recursion is just an FP thing?

As you point out, it's obviously not. But things that are typically done with mutation in OO and/or imperative languages aren't in FP, of course, and so, as you know, you end up with a lot of intermediate state represented on the stack, hence the concern about overflow.

It's funny how many of these issues start off being technical but end up being cultural, though: the FP emphasis on recursion ends up looking forced and "hard to follow" for most OO folks, in my experience, whereas the collection-of-state-some-of-which-is-implicit-via-inheritance of OO seems pretty whack to an FP person.

It was more a collective shrug

Well, except from Guido, though his feelings on recursion are pretty well documented. :)

But no, there weren't any functional language folks there. It was a dynamic languages meeting, so that's not much of a surprise. (For what it's worth, I lobbied, albeit quite weakly, for the tail recursion thing, but there's a limit to my enthusiasm since I didn't really care that much either.)

Lisp?

But no, there weren't any functional language folks there. It was a dynamic languages meeting, so that's not much of a surprise.

It is at least a little surprising that no Lisp people got invited when they were talking about dynamic languages.

Ah, right. I forgot. Lisp smells funny. :)

Tail recursion is not even a bytecode issue

Can somebody explain to me why there is such pressure on the VM designers to put a tail-call bytecode in? To me it seems more appropriate to address the requirement to the VM implementors. After all, a tail call is semantically equivalent to a normal call followed by a return instruction. Any JIT compiler should be able to optimize the call-return sequence into a proper tail call, if only its implementors cared to do it.

All the JVM needs right now is a proper implementation optimized for execution of typical code generated by non-Java front-ends. Some code, like the call-return sequence, could be optimized with no change in the bytecodes whatsoever. For other features it would be nice to get more "hints" from the compiler, but I believe the bytecode file headers already contain an area where the compiler can store arbitrary additional information about the code.

guarantee

The importance of proper tail recursion1 is that, without it, certain programs (see part III), such as programs in CPS, or implementations of mutually tail-calling finite automata, are guaranteed to fail. Because of this, proper tail recursion is not simply an optimization, but a requirement for the correctness of a language implementation2. Furthermore, programmers are forced to use a language's built-in loop mechanisms, and can't invent their own.

Now, IANA compiler expert or VM designer, but with a tail-call bytecode, a compiler writer can ensure that tail calls in the source language -- in whatever form they take in that language -- will become tail calls in the bytecode. Relying on the VM to compile call-return sequences to tail calls would force the compiler writer to ensure that tail calls in the source language did get compiled correctly to sequences that the VM would further compile to tail calls. This is pretty much tantamount to having a tail-call bytecode in the first place. And having the bytecode there makes it that much more explicit; it's easier to see where the VM is guaranteeing that a call will be a tail call.

It comes down to recognizing that proper tail recursion is not just an optimization but a correctness property.


1 A terrible name, since of course it's not about recursion but tail calls. Tail-call optimization is not a good name either, though, since as I'm explaining, it's not an optimization but, e.g. in the case of Scheme, a requirement.

2 Naturally, it's only a correctness requirement if your language guarantees such programs to succeed. But these are useful programs; they're about writing code that follows the data structures naturally, and about implementing different kinds of loops to fit different kinds of data structures. These are good things in OOP and FP alike.
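To make the guarantee concrete, here is a minimal Java sketch (the class and method names are mine, purely illustrative) of a two-state automaton written as two mutually tail-calling methods. Semantically it is a loop, but since the JVM retains a frame for every call, a large enough input is guaranteed to fail:

    // A two-state finite automaton as mutually tail-calling methods.
    // Every transition is a tail call, yet the JVM keeps its frame.
    class EvenOdd {
        static boolean isEven(long n) {
            return n == 0 ? true : isOdd(n - 1);   // tail call
        }
        static boolean isOdd(long n) {
            return n == 0 ? false : isEven(n - 1); // tail call
        }
        public static void main(String[] args) {
            System.out.println(isEven(10));        // fine: true
            System.out.println(isEven(10000000));  // java.lang.StackOverflowError
        }
    }

With proper tail calls this runs in constant stack space for any input; without them, the tolerable input size is bounded by the stack, which is exactly the correctness point.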

Re: guarantee

The importance of proper tail recursion1 is that, without it, certain programs (see part III), such as programs in CPS, or implementations of mutually tail-calling finite automata, are guaranteed to fail.
I see your point, but I have a feeling this is an overstatement. Mutually tail-calling functions will fail on a VM that doesn't optimize tail calls only if they overflow the call stack. A tail-call-optimizing VM is equivalent to a VM with infinite call stack depth. The problem with the cases you mentioned is that the depth of recursive calls grows linearly with the size of the input, or even faster. But that doesn't mean a program can't run for any input.

I suppose you could compare a VM without tail call optimization to a VM that doesn't perform any garbage collection. They both work for a large class of programs, and for others they work only when inputs are small. Notice that the Java specification, AFAIK, contains no guarantees that the GC will be run at any time before the end of the world. I'm not sure it even guarantees that the GC will collect all unreachable references when it runs.

Now, a VM that doesn't collect garbage is perfectly fine for many uses, and is in fact superior for hard real-time applications. But if your application or language translating to JVM needs garbage collection, you better choose an implementation that does collect garbage. Why would tail calls be any different?

Now, IANA compiler expert or VM designer, but with a tail-call bytecode, a compiler writer can ensure that tail calls in the source language -- in whatever form they take in that language -- will become tail calls in the bytecode.
I am, sort of. Without any tail-call bytecode, a compiler writer would optimize every call followed by a return into a tail call, provided that the call is not nested in a "try ... finally" block or something equivalent. This is neither hard nor expensive to test. Furthermore, a JIT back-end writer would be able to optimize calls that the compiler front-end did not mark as tail calls. It could even magically rearrange the code so that a "normal" call becomes a tail call. This last case may be pushing it, but Sun's HotSpot is already doing stuff like that.

how soon you hit the wall

But that doesn't mean a program can't run for any input.

Right, but this limit is not an academic issue -- there are lots of programs that you can only write recursively with proper tail recursion. Java's stack depth is typically pretty small. (This is just as big an issue for recursive functions that aren't tail calls, since there's a fixed stack size. My friend describes the error as "I have this big heap. Stack overflow.") And no matter how big you up the stack size, it'll never be able to handle infinite tail-recursive functions like event handlers, actor-style programs, etc.

Now, a VM that doesn't collect garbage is perfectly fine for many uses, and is in fact superior for hard real-time applications. But if your application or language translating to JVM needs garbage collection, you better choose an implementation that does collect garbage. Why would tail calls be any different?

Yes, this is a good point. Applications with very different needs require very different platforms. That's why the Java real-time system is spec'ed and implemented as a completely different platform from J2SE (or J5SE, or whatever it's called now). But if your program depends on proper tail calls in order not to blow up, you need your platform to guarantee them.

Without any tail-call bytecode, a compiler writer would optimize every call followed by a return into a tail-call...

I understand. What I'm saying is that this is hardly any different from having a tail-call bytecode. If your compiler needs to guarantee proper tail calls, it will have to make sure to put them in a form that the VM will JIT correctly, and the VM has to guarantee that it will in fact perform the tail calls in constant space. Whether these take the form of a tail-call bytecode or a call+return sequence is essentially syntactic. But either way, the correctness of the compiler's implementation of proper tail recursion relies on the correctness of the VM's implementation of proper tail recursion. Current JVMs do not make these guarantees. They should.
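For what it's worth, the standard workaround while the VM makes no such guarantee is a trampoline: the compiler makes each tail call return a description of the next call instead of performing it, and a flat driver loop runs the steps in constant stack space. A sketch, with all names hypothetical (not taken from any particular compiler):

    // A trampoline turns tail calls into data; the driver loop replaces the stack.
    class Trampoline {
        interface Step { Object run(); }           // "the next thing to do"
        static final class Done implements Step {  // a final result
            final Object result;
            Done(Object r) { result = r; }
            public Object run() { return this; }
        }

        static Object drive(Step s) {              // constant stack space
            while (!(s instanceof Done))
                s = (Step) s.run();
            return ((Done) s).result;
        }

        // the even/odd automaton, trampolined
        static Step isEven(final long n) {
            if (n == 0) return new Done(Boolean.TRUE);
            return new Step() { public Object run() { return isOdd(n - 1); } };
        }
        static Step isOdd(final long n) {
            if (n == 0) return new Done(Boolean.FALSE);
            return new Step() { public Object run() { return isEven(n - 1); } };
        }

        public static void main(String[] args) {
            System.out.println(drive(isEven(10000000))); // true, no overflow
        }
    }

It works, but every step allocates an object and bounces through the driver loop, which is why compiler writers would rather have the VM guarantee tail calls directly.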

GC guarantees

Notice that the Java specification, AFAIK, contains no guarantees that the GC will be run at any time before the end of the world.
Of course it does not -- why run the GC if your program does not allocate heap memory, for example? What it does guarantee is that the GC will do its best before throwing an OOME:
OutOfMemoryError: The Java virtual machine implementation has run out of either virtual or physical memory, and the automatic storage manager was unable to reclaim enough memory to satisfy an object creation request.
I grant that this differs only marginally from what you said.

Also note that the spec explicitly allows the Java stack to be non-contiguous:

Because the Java virtual machine stack is never manipulated directly except to push and pop frames, frames may be heap allocated. The memory for a Java virtual machine stack does not need to be contiguous.
From the statements that stacks are thread-private and the heap is shared between threads, it looks like the heap on which stack frames may be allocated is not the JVM's heap, but most probably the C heap :-( Formal semantics, anyone?

Nevertheless, with some minor revisions the spec could be brought to a state in which implementations are at least allowed to GC stack frames, and this is almost as good as tail-call optimization (argh, don't we have a better term for that?).

Java isn't the same as the JVM

Larry's kitchen-sink list references several shortcomings of Java the language. But adding all of his list would result in what we have in Perl now (... I sense the arrows being loosed as I write ...) - an unwieldy, non-intuitive language that gives me the same feel as when I have to deal with the Windows architecture. Any sense of unification and elegance is buried under a mound of bolted-on extensions. Java the Language has enough of that already.

I would feel more comfortable having a discussion around modifications to the JVM that would enable more languages to be easily ported to the JVM (especially functional languages - but that's my preference). Then we don't have to have the long and fruitless "discussions" about whose language is "better" and why my marvelous features aren't defined in Java the language. At CofC, I have an ongoing project to build a new version of CL to run directly on the JVM. Frankly, I do wish the JVM had more support for continuations. But we're figuring out how to bend the JVM to our needs nevertheless. And more interestingly, by thinking out an architecture that doesn't fight the JVM, it seems to be relatively simple to call back and forth between Lisp and Java (without requiring a foreign function interface). Perhaps Larry would be happier if someone just ported Perl to the JVM and provided a simple two-way calling mechanism. Then if I needed regexes in some version of Prolog written to the JVM, I'd call out to Perl.

So here's my challenge to the functional community. What minimal set of JVM changes is necessary to support implementation of functional languages compiled to the JVM?

Armed Bear

Is your project somehow related to ABCL?

relation to ABCL

No, CLforJava is an independent effort involving undergraduates at the College of Charleston in a "real world" software engineering project. Each semester, the project focuses on developing functionality in a different area of CL. For example, the primary efforts this past semester were to implement a full Character system that fully integrates Unicode 4.0 into CL and to rewrite our number system to conform to some architectural changes. The students get a chance to work with industrial tools (Perforce, NetBeans, Bugzilla, Ant, TWiki, JUnit, Movable Type) and industrial processes. They get to review and understand a complex system (we're at 540 classes - much bigger than your typical undergrad project), write specs, write and document code, write test cases, and fix bugs (theirs and others').

CLforJava differs a bit from ABCL in several ways.

  • There is a significant level of interoperability between Java and Lisp. The Lisp type system (and any dynamic types) is mirrored in the Java type system (interfaces).
  • For all instantiable types, there is a Factory pattern in Java to create new instances.
  • All Common Lisp functions are directly accessible to Java programmers.
  • (coming soon) The ability to directly call any Java method on any class as if the method were just a normal generic function.
  • Unicode 4.0 support. We are planning to submit a paper to the ILC 2005 gathering that proposes a compatible way to fold Unicode 4.0 into the current CL ANSI spec.

The current projected 1.0 release (under the GPL2) will be around 2009. I intend to have 0.n releases sooner, but no promises.

I apologize that we don't have a public web site yet. We have some other student groups competing to create our site. It should be up early in the new year. I'll post its availability here. If you would like to see some of the work, email me, and I'll give you access to the TWiki site we have for the project. However, I won't be able to respond until early Jan. Thanks for the interest.

Non-Java languages on JVM

Besides the tail call optimization mentioned above, I would have at least these problems with porting my dynamically typed, impurely functional language Kogut to the JVM (BTW, .NET would have the same problems, except perhaps tail calls):

  • Small integers are not statically distinguished from other objects, and thus they would have to be heap-allocated. The JVM doesn't allow the common trick of tagging small integers in the lowest one or two bits to distinguish them from pointers.
  • A program can try to apply an arbitrary object to a sequence of arguments; many objects, like numbers and strings, always respond with an exception, but e.g. records can be applied to field names. This leads to two implementation choices, neither very efficient: have an interface implemented by all my objects (problem: all Java objects must be wrapped, e.g. a string can't be used as a string directly; see the sketch after this list), or have an interface implemented only by some of the objects and downcast for application (problem: the cost of downcasting).
  • Each distinct function body would yield a separate class, including anonymous functions. AFAIK classes in Java are quite heavy; they are put in separate files. This requires many small files. Functional style uses anonymous functions a lot.
  • What is the memory overhead of a Java object consisting of fields, relative to fields themselves? My current implementation adds 1 word. If it's 2 words, it's still OK but it's a pity. If it's more, it becomes bad. Functional style leads to lots of small objects.
  • Kogut strings use Unicode code points, not UTF-16 units. This means either having two string representations or not using native Java strings at all.
  • I prefer threads to asynchronous operations or separate APIs with timeouts. My current implementation provides lightweight green threads; Java threads are probably much heavier. A programming style which uses many threads would not be feasible, and maybe I would have to complicate the API to provide variants which don't need to use threads so often. As with deep recursion, it should be the language's responsibility to be scalable; the program should not be forced into an awkward style just to avoid large stacks or lots of threads.
  • Asynchronous communication between threads with temporary blocking of asynchronous signals will probably be hard to map. Unfortunately I don't have a specification written down in one place yet (and the design might still change a bit), and going into details would take too much space here. I also expect problems with weak references and finalizers.
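To make the application point above concrete, here is what the first choice looks like; the Applicable interface and the wrapper classes are hypothetical names of mine:

    // Every language-level value implements Applicable, so a java.lang.String
    // can never be used directly; it must be wrapped to carry the method.
    interface Applicable {
        Object apply(Object[] args);
    }

    class KString implements Applicable {      // wrapper around the native string
        final String contents;
        KString(String s) { contents = s; }
        public Object apply(Object[] args) {   // strings respond with an exception
            throw new RuntimeException("a string is not applicable");
        }
    }

    class KRecord implements Applicable {
        final java.util.Map fields = new java.util.HashMap();
        public Object apply(Object[] args) {   // records applied to field names
            return fields.get(args[0]);
        }
    }

The second choice drops the universal interface and downcasts at every application site instead, trading the wrapping overhead for the cost of a cast on every call.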

Hmm

Small integers are not statically distinguished from other objects, and thus they would have to be heap-allocated. The JVM doesn't allow the common trick of tagging small integers in the lowest one or two bits to distinguish them from pointers.

The JVM has distinct unboxed and boxed integer types (int vs. java.lang.Integer or something). [Together with facilities like overloading, this might give you the desired effect, although more bloated.] Am I misunderstanding you?

AFAIK classes in Java are quite heavy; they are put in separate files. This requires many small files.

Just one file per public class, IIRC.

Dynamic typing

JVM doesn't provide a way to select at runtime whether a particular variable will hold an unboxed integer or a pointer.

While some implementations of dynamically typed languages don't use this (Python represents all values as pointers, including small integers), most of them do, because it avoids heap allocation in integer arithmetic (until the number no longer fits in a machine register minus a bit).

It can be implemented semi-portably in C, by shifting an integer one bit to the left and tagging the value in the least significant bit, to distinguish it from a pointer to a heap-allocated object, which is assumed to be even (objects are aligned). Correctness of this is not guaranteed by the C standard, but it works on all common architectures and is widely used.

> JVM doesn't provide a way t

JVM doesn't provide a way to select at runtime whether a particular variable will hold an unboxed integer or a pointer.

I think the main problem is that you can't detect overflows etc. in JVM, right?

How about C#? C# 'unsafe' functions can use pointers, IIRC... is it possible to cast between pointers and integers in 'unsafe' functions? In contrast to the JVM you can detect overflows etc., as well.

Big integers

> I think the main problem is that you can't detect overflows etc. in JVM, right?

This is a different thing. Detecting overflows is not easy in C either, yet it can be done portably without too much overhead. A trick for addition of signed words: if z = x + y modulo the word size, then there was an overflow if and only if (~(x^y) & (x^z)) < 0. This produces a single conditional jump on x86, along with a few instructions of fast arithmetic. Optimized assembly is only slightly better. Multiplication is harder, though.
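The same trick carries over to Java directly, and there it is even portable, since Java guarantees that int arithmetic wraps. A sketch (checkedAdd is my name for it):

    // z = x + y wraps modulo 2^32. Overflow happened iff x and y have the
    // same sign and z has the opposite one, i.e. the sign bit of
    // ~(x ^ y) & (x ^ z) is set.
    static int checkedAdd(int x, int y) {
        int z = x + y;
        if ((~(x ^ y) & (x ^ z)) < 0)
            throw new ArithmeticException("integer overflow");
        return z;
    }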

Handling overflow doesn't solve the problem of representing small integers without heap allocation.

If you can't optimize small integers anyway, perhaps the best way is to just use BigInteger for all integers. It is probably poorly optimized for the case of mostly small integers, always using generic loops over digits in some large base. But since you can't make small integers fast anyway (avoiding heap allocation, with optimal overflow handling), perhaps it's the best that can be done in Java. The Java designers assumed that you would use types like int for values known to be small enough to fit; unfortunately, this is incompatible with dynamic typing.

> How about C#? C# 'unsafe' functions can use pointers, IIRC...
> is it possible to cast between pointers and integers in 'unsafe' functions?

No, .NET has the same problem as JVM here, and unsafe mode with pointers doesn't help at all.

.NET adds stack-allocated structures which in theory could be used to avoid heap-allocating small integers, but at the cost of doubling the size of all values (not objects, but variables, fields, etc.), which means it would be much worse overall.

> In contrast to the JVM you can detect overflows etc., as well.

Yes, if the implementation compiles 'try' with zero overhead when the exception is not thrown (I believe they do, but I'm not sure), then overflow detection can be made as cheap as theoretically possible. You still have to heap-allocate integers, though.

Some range of integers can be preallocated, as Python does (Python being a rare case of an implementation of a dynamically typed language which heap-allocates all integers). This is probably worth the effort, but it's not as good as representing all integers that fit in a machine word minus one bit as bit patterns distinct from pointers: you can't preallocate that many integers, and you must fetch their contents from memory to read their value.
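For what it's worth, Java 5 itself does this kind of preallocation for boxed integers: Integer.valueOf (and autoboxing) is specified to return shared objects for at least the range -128..127, with exactly the limitation described, since larger values are freshly heap-allocated:

    Integer a = Integer.valueOf(100);      // inside the cached range
    Integer b = Integer.valueOf(100);
    System.out.println(a == b);            // true: the same preallocated object

    Integer c = Integer.valueOf(100000);   // outside the cached range
    Integer d = Integer.valueOf(100000);
    System.out.println(c == d);            // false (typically): two heap objects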

Missing: Pattern Matching

Pattern matching is not listed in "things to discuss". A pity, because I find it highly useful for expressing all kinds of transformational tasks. I would even wish for more powerful pattern-matching operators, perhaps going all the way to unification.

Well, the omission is hardly surprising, since in OO languages pattern matching does not have the same significance as in, say, Haskell or ML. In Java, there is instanceof to "match" on classes, but its use is frowned upon. Or I can use the built-in dispatch mechanism, but that scatters code across many methods. I would probably end up using some form of Visitor pattern for, e.g., tree transformations, as sketched below.
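For instance, a two-constructor expression type that an ML programmer would take apart with a single match expression becomes a full Visitor setup in Java; a minimal sketch with illustrative names:

    // What  match e with Num n -> ... | Add (l, r) -> ...  turns into:
    interface Expr { Object accept(Visitor v); }
    interface Visitor {
        Object visitNum(Num n);
        Object visitAdd(Add a);
    }

    class Num implements Expr {
        final int value;
        Num(int v) { value = v; }
        public Object accept(Visitor v) { return v.visitNum(this); }
    }
    class Add implements Expr {
        final Expr left, right;
        Add(Expr l, Expr r) { left = l; right = r; }
        public Object accept(Visitor v) { return v.visitAdd(this); }
    }

    // One "match" = one Visitor class, one method per pattern case.
    class Eval implements Visitor {
        public Object visitNum(Num n) { return new Integer(n.value); }
        public Object visitAdd(Add a) {
            int l = ((Integer) a.left.accept(this)).intValue();
            int r = ((Integer) a.right.accept(this)).intValue();
            return new Integer(l + r);
        }
    }

Adding a case means touching the Visitor interface and every visitor, and nested patterns have no counterpart at all.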

Compared to your average FPL, OO-style "pattern matching" patterns are rather inexpressive and awkward. In other OO languages the situation is probably similar, with Scala as a notable exception.

Compared to your average FP

Compared to your average FPL, OO-style "pattern matching" patterns are rather inexpressive and awkward.
Isn't this an issue of algebraic vs. coalgebraic data rather than FP vs. OO? But I agree that when writing in Java with algebraic data structures one misses some tools - like pattern matching (the last time I did it, I was implementing a kind of type inference in Java; it took 20 lines in Prolog, and tons of classes in Java).

An interesting analysis is given in OOP as an Enrichment of FP.

language vs. runtime issue

I think the list of features mixes language and runtime issues. That's not surprising given the Perl implementation, where basically all the language features have a corresponding opcode in the runtime to implement them. This makes sense in an interpreter, where you want to avoid the dispatch overhead and have coarse opcodes.
But I don't think it would make sense for the JVM, just as it doesn't make sense for Mono or the CLR. Some examples follow.

Anonymous code and closures: this is a language feature; the JVM would need no changes to support them, just as the CLR didn't need changes to add support for them in C#. Only the C# compiler needed changes.
What a language needs that could be relevant is the ability to garbage collect the code of methods built at runtime. I'm not sure if the JVM already has this capability: Mono and the CLR have some support for this (using separate appdomains or dynamic methods).

lvalue methods: this is again a language feature. In Perl you can write something like:
$obj->method () = $some_value;
But this is again a compiler feature: the method returns a reference (think: pointer) to the value that needs setting. OK, the JVM doesn't have pointers, but this can be implemented in other ways. An lvalue method can simply be treated as a setter method: the compiler doesn't need any specific runtime support to implement it.
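A sketch of the reference-returning encoding, with made-up names (Ref, Counter); even simpler, the compiler could desugar straight to a generated setter call:

    // $obj->method() = $v  can desugar to  obj.method().set(v)
    interface Ref {
        Object get();
        void set(Object value);
    }

    class Counter {
        private Object count = new Integer(0);
        Ref count() {                          // the "lvalue method"
            return new Ref() {
                public Object get() { return count; }
                public void set(Object v) { count = v; }
            };
        }
    }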

Some of the other requested features are not described well enough, at least for me, to tell whether they are language or runtime features. Of course some could be implemented by the runtime, like access to the caller context, but why put the burden of it on all the languages that run on the runtime (because it has a runtime cost) if they don't need it? The Perl compiler would use its own way to pass the info, maybe as an extra argument.

Efficient regexes are a library issue: dynamic code generation is all that's needed to get them.

Continuations are a feature that requires extensive (and expensive) runtime support, so it's something that should likely not be implemented in a general-purpose VM. It's also not a feature that many languages need (or they can implement it themselves, though it will be slow).
It would be interesting if people who have actually written compilers for languages that use continuations could propose a set of opcodes/runtime features that would make continuations more easily implementable in the JVM/Mono/CLR runtimes without slowing down the other languages that don't need them.

Possible vs. Easy

Anonymous code and closures: this is a language feature, the JVM would need no changes to support them . . .

It's not so much a matter of making it possible, but rather of making it easy. For instance, Parrot has opcodes for keeping track of lexicals, which are of great use for closures and such.

Continuations are a feature that requires extensive (and expensive) runtime support . . .

Very true, though Parrot is trying anyway. (In fact, there has been some lengthy debate recently on the Parrot mailing list about how to restore registers in the face of continuations.) A while back, Dan showed some interesting code about optimizing tail calls with continuations (though I can't seem to find it ATM).

Re: Possible vs. Easy

It's not so much a matter of making it possible, but rather of making it easy. For instance, Parrot has opcodes for keeping track of lexicals, which are of great use for closures and such.

Well, the not-so-easy code is the part that determines which vars are needed in the closure and which are not. If your compiler has that code and applies the special opcode to them, it's no different from changing the var to a field in the object representing the closure, as it's implemented in Mono. If the opcode keeps track of all the vars, then it's just as easy to save all the vars in the JVM or in Mono/CLR (but, of course, just as wasteful).
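In other words, once the compiler has done the capture analysis, the JVM encoding is mechanical; a sketch of the field-per-captured-variable translation (names made up):

    // source idea: makeAdder(n) returns a closure over n
    interface Fn { int apply(int x); }

    class ClosureDemo {
        static Fn makeAdder(final int n) {
            // the compiler rewrites the captured local `n` into a field
            // of the generated class; Java's anonymous classes do just that
            return new Fn() {
                public int apply(int x) { return x + n; }
            };
        }
        public static void main(String[] args) {
            Fn add5 = makeAdder(5);
            System.out.println(add5.apply(2)); // prints 7
        }
    }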

As for using continuations to optimize tail calls: as someone else explained, the JIT can do it, though it's useful to have a special opcode that forces the JIT to do it. That said, I follow the Parrot lists, and it looks like continuations have been one of the sources of delays and implementation issues, with few or no users of them. I'd suggest dropping them and getting on with implementing Perl and Python first, but that's just me ;-)

Already Using It

I suspect the reason that nobody uses continuations on Parrot yet is that they're currently broken :)

In any case, my understanding is that continuations are critical to implementing Perl6, Python, and Ruby (which are the target languages for Parrot), and also Scheme (which isn't a target language, but would be nice to have). Perl5 can probably get along with what's there now.

Still need standards

Even if you don't need explicit bytecodes, you need standards to make sure different compilers perform the same transformations (for interoperability).