J2SE 5.0 released

Sun announced the release of the Java 2 Platform Standard Edition 5.0.

Most of the new features that are interesting from a language design perspective were discussed here in the past. These include generics, autoboxing/unboxing, metadata and typesafe enums. We also discussed some of the new libraries and APIs.
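For readers who have not tried the release yet, here is a minimal sketch of how the language features listed above look in source code. The class and method names (`Java5Features`, `Suit`, `Card`, `sum`) are made up for illustration and do not come from Sun's documentation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; names are illustrative only.
class Java5Features {
    // Typesafe enum: a real type with its own namespace, not a set of int constants.
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

    // Generics: the compiler checks that only Integers go into the list.
    static int sum(List<Integer> values) {
        int total = 0;
        for (Integer v : values) { // enhanced for loop, also new in 5.0
            total += v;            // auto-unboxing: Integer -> int
        }
        return total;
    }

    static class Card {
        final Suit suit;
        final int rank;

        Card(Suit suit, int rank) {
            this.suit = suit;
            this.rank = rank;
        }

        // Metadata: an annotation the compiler verifies against the superclass.
        @Override
        public String toString() {
            return rank + " of " + suit;
        }
    }

    public static void main(String[] args) {
        List<Integer> ranks = new ArrayList<Integer>();
        ranks.add(7);  // autoboxing: int -> Integer
        ranks.add(10);
        System.out.println(sum(ranks));
        System.out.println(new Card(Suit.HEARTS, 7));
    }
}
```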

I can understand how Gosling is feeling right now: I wouldn't feel comfortable being responsible for a language without generics...

What's next for Java? The new release model is explained here, and the general design philosophy here.

Platform or Language?

From Hamilton's blog cited above:

One last thought: the Java language is only one language for the Java Platform. It has one particular set of values, which seem to be particularly good for creating large, maintainable programs. But different developers want different things at different times. It's OK to have multiple languages, each of which is well designed for a specific kind of programming. When I'm doing a quick Sunday morning hack, sometimes I want something more scripty, and I don't want to have to track all those checked exceptions. So I'd like to see more use of alternate languages on top of the Java platform, with different goals and styles. For example, I was really pleased to see the "Groovy" language brought into the JCP. It is designed as a new Java platform language, but with its own goals and design center that are very different from the classic Java language. Happiness!

Large, maintainable programs

It has one particular set of values, which seem to be particularly good for creating large, maintainable programs

This is a claim you often hear about Java, and it mystifies me slightly. I don't mean that I think Java is no good for creating large, maintainable programs (I wouldn't know, having only written small-to-medium-sized and somewhat crufty applications in other languages); just that I'm not sure what it is about Java that's supposed to make it better at this than other platforms/languages. What are the particular Java features that support scalability and maintainability? Shouldn't every language have at least the option of using similar features?

In relation to...

I guess Java is scalable and maintainable in relation to previous mainstream alternatives, like C, C++ (its main fault being lack of automatic memory management) and others.

State your proof, please

C/C++ are used in non-trivial software systems such as Windows, Linux, and a few others. Where has Java shown itself to be more scalable?

Should have added

I intended to say "I guess the author believes that Java is scalable and maintainable in relation..." I do believe Java programs tend to be easier to maintain than C programs, because of automatic memory management if for no other reason. No, I don't have proof for that, only my experience with both languages; therefore, it's only my opinion, and I don't claim it's the truth.

Mind you, I won't touch Java unless I really have to; the same goes for C and C++, except that I have some fondness towards C because it was my main language for ages. Please don't interpret my comment as a "my language is better than yours".

Scalability

I'm certainly not into "my language is better than yours" arguments. In fact, I'm not disputing the fact that managed programs are more maintainable (I've tracked my share of memory leaks in other people's programs ;-). It's the issue of scalability that I'm questioning -- in my experience, one can't claim to be scalable until one tries, so I was wondering how many examples we have of multi-million LOC programs in Java to compare with.

Java scalability

I've worked with Java on several multi-million line (serious) projects. The types of projects range across the board from financial to logistics to game servers. The project I'm working on right now is swiftly approaching 1 million lines of code, and it will likely pass another million before it goes GA.

I would say that Java has several features that aid in scalability.

- Automatic memory management. Not only does this simplify the code base, it also greatly reduces the complexity of interfaces.

- Builtin language support that encourages people to design by contract (i.e. interfaces).

- Reflection. The ease with which you can dynamically load and use an arbitrary implementation of a library is extremely important. Dynamic class loading goes hand in hand with separating interface from implementation.

- Static type safety. This will bring on the holy wars, but in my own experience, static types are very important to scaling a project across many users.

- Builtin documentation à la Javadoc.

- A safe virtual machine to run under. People can't do stupid things to you like #define private public or overwrite your object's memory space. If someone does something stupid, you're guaranteed to get a meaningful stacktrace back right into their code.

- A simple language and bytecode format that has encouraged the proliferation of all sorts of code generators and bytecode generating and manipulating tools.

I wouldn't say that all of these things are unique to Java, but compared to all of the other mainstream languages (and a lot of non-mainstream languages too), it's my opinion that Java has the best combined feature set for scalability.
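To make the interface and reflection points above concrete, here is a rough sketch of loading an implementation by class name behind an interface. Everything named here (`PluginDemo`, `Greeter`, `PlainGreeter`, `load`) is hypothetical; in a real system the class name would typically come from a configuration file rather than from the code itself, and the sketch uses `ReflectiveOperationException`, which only exists in later Java releases.

```java
// Hypothetical example; names are illustrative only.
class PluginDemo {
    // The contract that callers program against.
    interface Greeter {
        String greet(String name);
    }

    // One implementation; callers never need to name it at compile time.
    static class PlainGreeter implements Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    // Load an arbitrary Greeter implementation from its binary class name.
    static Greeter load(String className) throws ReflectiveOperationException {
        Class<?> raw = Class.forName(className);
        // asSubclass throws ClassCastException if the class is not a Greeter.
        Class<? extends Greeter> impl = raw.asSubclass(Greeter.class);
        return impl.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the name would normally be read from configuration.
        Greeter g = load(PlainGreeter.class.getName());
        System.out.println(g.greet("world"));
    }
}
```

The caller only depends on `Greeter`, so swapping the implementation is a matter of changing a string, which is the separation of interface from implementation the list describes.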

Java GC vs scalability

Automatic memory management. Not only does this simplify the code base, it also greatly reduces the complexity of interfaces.

A few years ago I worked on a project to integrate a new protocol stack with a large enterprise application (written primarily in C). Scalability was a chief concern. We evaluated a number of third-party stacks, mostly written in C, along with one written in Java. The initial Java version performed very badly. Stress tests which barely stretched the C stacks caused Java to fall over (the C stacks could cope with around 200 simultaneous requests; Java barely managed 10). The problem was tracked down to an overuse of objects (a quite staggering number were created for a single session) and a reliance on Java's garbage collection. The vendor rewrote the stack to avoid GC by doing its own memory management with an object pool. They also moved to a more procedural style from the OO code they had been using (the design looked great on paper, until you counted the number of objects). These changes greatly improved the memory footprint of the code (although the C versions were still much better).
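For anyone unfamiliar with the technique, an object pool along these lines might look roughly like the sketch below. This is not the vendor's code, which I never saw in this form; `BufferPool` and its methods are invented for illustration, and it assumes fixed-size byte buffers are the objects being reused.

```java
import java.util.ArrayDeque;

// Hypothetical sketch of a pool allocator; not the vendor's actual design.
class BufferPool {
    private final ArrayDeque<byte[]> free = new ArrayDeque<byte[]>();
    private final int bufferSize;
    private int allocated = 0;

    BufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    // Reuse a released buffer if one is available, otherwise allocate.
    synchronized byte[] acquire() {
        byte[] buf = free.pollFirst(); // returns null when the pool is empty
        if (buf == null) {
            buf = new byte[bufferSize];
            allocated++;
        }
        return buf;
    }

    // Hand a buffer back so a later acquire() can reuse it.
    // The contents are not cleared; callers must not rely on zeroed memory.
    synchronized void release(byte[] buf) {
        if (buf != null && buf.length == bufferSize) {
            free.addFirst(buf);
        }
    }

    // Number of buffers ever created by this pool.
    synchronized int allocatedCount() {
        return allocated;
    }

    public static void main(String[] args) {
        BufferPool pool = new BufferPool(4096);
        for (int session = 0; session < 1000; session++) {
            byte[] buf = pool.acquire();
            buf[0] = (byte) session; // stand-in for real per-session work
            pool.release(buf);
        }
        System.out.println("buffers allocated: " + pool.allocatedCount());
    }
}
```

The trade-off is that the programmer takes back responsibility for object lifetimes: a buffer that is released while still in use produces exactly the kind of bug that GC was supposed to rule out.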

I later heard from experienced Java programmers that manual memory management was the only way to go if you wanted any kind of scalability out of Java. Furthermore, I was given the impression that Java's memory model made it difficult to write an efficient GC. This was about 3 years ago though, so I would be interested if things have changed since.

(Note: this shouldn't be construed as an argument against GC in general - I understand OCaml's GC is extremely efficient. It's merely some anecdotal evidence about a handful of implementations of Java's brand of GC.)

GC doesn't program for you

The problem was tracked down to an overuse of objects (a quite staggering number were created for a single session)

Just from considering the story you relate, I would be inclined to look for a solution in the quality of the program rather than drop the failure in Java's lap or in that of its GC.

There ain't no such thing as a free lunch: GC is a great tool for simplifying memory management, but it doesn't mean you can stop thinking about how much memory you use or how you use it.

As with so many other programming problems, you still have to know what you are doing (it helps to use a profiler), and small changes to your algorithm and data structures can often yield significant improvements in memory and time performance.

The only thing about the Java language or community that might have contributed to this is that there were/are still lots of people blinded by the "magic" of the platform who haven't learned that there is still a lot of work by the magician required to pull the trick off.

Quality of program

Just from considering the story you relate, I would be inclined to look for a solution in the quality of the program rather than drop the failure in Java's lap or in that of its GC.

Indeed. There were clearly some quite serious problems with the design of the code. This wasn't entirely the fault of the initial programmers, as the intended application of the stack was not the high-load use that we required. However, the people involved with the profiling of the code came to the conclusion that even with many design improvements, a significant amount of time and resources was being spent in garbage collection cycles. Hence the decision to move to a pool allocator, which did result in measurable performance gains on the particular Java implementations tested.

I agree though that in general it would be a mistake to blame a poor design on Java or GC. It is however possible (and in this case, probable) to have both bad design and a bad Java implementation of garbage collection.

Are we talking about the same scalability?

Scalability was a chief concern

I think the OP (and I) were referring to a different kind of scalability - scaling the size of the application itself.

To address your other point, yes, the GCs available for Java have changed greatly over the years. So much so that the common wisdom is to avoid any object pooling unless the object construction is inherently expensive (e.g. network connection), because you'll spend more time in pooling code than in the allocation and GC. As an example of the relative sophistication of the current crop of collectors, take a look at this tuning documentation. There are also new GC policies in 1.5 that allow you to specify constraints like max GC pause.

Of course, a garbage collector will never be able to compensate for an incompetent developer. No language feature ever will.
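Before reaching for a pool or a tuning flag, it is worth measuring how much time the collector actually takes. Below is a minimal sketch using the `java.lang.management` API that arrived with 5.0; the class name `GcStats` and the allocation loop are my own invention for illustration. As far as I know, the pause-time goal mentioned above is set on HotSpot with the `-XX:MaxGCPauseMillis` option, but treat that as something to check against the tuning documentation for your JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; names are illustrative only.
class GcStats {
    // Approximate accumulated GC time in milliseconds across all collectors.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollectionMillis();
        long checksum = 0;
        // Allocate many short-lived objects, the pattern pooling tries to avoid.
        for (int i = 0; i < 5000000; i++) {
            int[] scratch = new int[16];
            scratch[0] = i;
            checksum += scratch[0];
        }
        long after = totalCollectionMillis();
        System.out.println("checksum: " + checksum);
        System.out.println("GC time during loop (ms): " + (after - before));
    }
}
```

A number like this is only a coarse indicator; a real profiler gives a much better picture of where the allocations come from.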

Re: Large, maintainable programs

"just that I'm not sure what it is about Java that's supposed to make it better at this than other platforms/languages."

Well, primarily I think it's a marketing claim, and Sun probably wants to sell to a market that makes big programs.
With that said:
- Lack of inheritance diamonds
- Lack of template bloat
- Lack of compile-time blowout
- A large standard library, which avoids the problems of having to juggle multiple non-standard libraries.
Of course, all these issues are things that C++ does not deal with well.

That's roughly what I thought

As far as the language goes, it mostly boils down to omission of "features" (e.g. multiple inheritance, operator overloading) rather than extra, special features (e.g. DBC, or a more powerful module system).

garbage collection is nice

I write small C++ programs for machine learning on large, natural-language corpora. I don't especially like the language, but memory management is not a difficult problem any more, thanks to some of the newer template libraries (STL, Boost) and tools like valgrind. That is, you'll really never want to use new/delete or malloc/free directly, but instead use objects which clean up after themselves on destruction (something that has to be done anyway to ensure exception safety without going insane with explicit try { } catch { } everywhere).

Since memory management policy objects are accepted by any sort of modern C++ ADT, you can use a mix of garbage collection, reference counting, stack regions, explicit ephemeral heaps ...

The C++ template mechanism is really something Java can't approach without explicit multi-stage programming. C++ templates do have their disadvantages (they're unwieldy both to use and to process), but I'd miss them ... or at least I'd want something like MetaOCaml in their stead.

Memory management

That is, you'll really never want to use new/delete or malloc/free directly

The problem with that is that you still have to deal with lifetime issues all over the place (how long does this reference need to be good for?) unless you're passing copies of all of your data around everywhere. By the time you start using mechanisms like boost::shared_ptr, you're just using a form of weak and slow garbage collection.

Basically what I'm saying is that, in my experience, C++ is great for writing very small (< 10KLOC), very well performing portions of code. The problem is that that same flexibility ends up being a nightmare when you need to build significantly larger systems. People usually try to combat the complexity with idioms that perform much worse under C++ (copying and ref-counting) than their counterparts in other languages (full gc in Java).

What I've found is that I am now primarily using C++ for very compute-intensive pieces of code, for example, something that needs to take advantage of SIMD units on a machine. The rest of the codebase remains written in a more scalable language, like Java. (The current application I'm working on is about 1% C++ and 99% Java).