Sun ships "extensible" Java compiler?

According to the buzz on this forum thread, Java 5.0 (née 1.5.0) ships with a tool called "apt" (Annotation Processing Tool) that enhances the traditional javac compiler with the ability to let the user write code that examines and transforms the AST.

I assume this tool will not let you extend the Java language in arbitrary ways: it seems like apt can only parse source code that is legal Java. However, it will let you create compile-time checks for some rules that previously could only be checked at runtime. It will also likely be useful for code generation; some heavily used Java projects rely on bytecode manipulation (JDO, JBoss, Tapestry) and I suspect they could simplify their lives a lot with this.


I wouldn't call apt an extensible compiler

the ability to let the user write code that examines and transforms the AST.

As I have already written in that thread, apt does not let you transform the compiler's AST. That would be the "killer" feature that Sun needs to add to apt to make it a very useful tool.

As it stands now, all that apt supports is reading the compiler's AST as source files are being compiled and aiding you in generating new source files to be compiled. If you wish to do any transformations whatsoever to existing code, you'll have to do it all outside of the compiler and apt.
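For a rough picture of that "read, then generate new files" model, here is a minimal sketch. It does not use apt's own API; it assumes the standard javax.annotation.processing API that later JDKs ship, which follows the same model. The names GenerateInfo, InfoProcessor, and companionSource are invented for illustration, and the sketch assumes the annotated classes live in the default package.

```java
import java.io.IOException;
import java.io.Writer;
import java.lang.annotation.ElementType;
import java.lang.annotation.Target;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// Hypothetical marker annotation; the name is invented for this sketch.
@Target(ElementType.TYPE)
@interface GenerateInfo {}

@SupportedAnnotationTypes("GenerateInfo")
class InfoProcessor extends AbstractProcessor {

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    // Builds the text of a brand-new companion class.
    // The annotated class itself is only read, never rewritten.
    static String companionSource(String simpleName) {
        return "final class " + simpleName + "Info {\n"
             + "    static final String SOURCE = \"" + simpleName + "\";\n"
             + "}\n";
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (Element e : roundEnv.getElementsAnnotatedWith(GenerateInfo.class)) {
            String name = e.getSimpleName().toString();
            // The Filer can only create new files; it offers no way to edit existing ones.
            try (Writer w = processingEnv.getFiler().createSourceFile(name + "Info", e).openWriter()) {
                w.write(companionSource(name));
            } catch (IOException ex) {
                processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR, ex.toString(), e);
            }
        }
        return true; // claim the annotation so other processors skip it
    }
}
```

Run with something like javac -processor InfoProcessor, a class Foo marked @GenerateInfo would get a separate FooInfo source file compiled alongside it. Nothing in this model lets the processor change Foo itself, which is the limitation described above.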

It will also likely be useful for code generation; some heavily used Java projects rely on bytecode manipulation

Not as it stands, no. You can even see Bill Burke lamenting the same thing I am, at the bottom of the thread.

I assume this tool will not let you extend the Java language in arbitrary ways

I guess it depends upon what you mean by "extending the language." No, you can't arbitrarily change the syntax of the language, but it really is amazing what you can do with annotations, since they can be applied to just about anything except comments. I'm currently working on adding support for design by contract (DBC) to Java. Here's an example of what I have so far:

  /**
   * Returns the sum of Math.abs( values[ i ] ) for i in
   * the range [begin, end).
   */
  @declare( "this >= 0" )
  public static int sum( @declare( "this != null" ) int[] values,
                         @declare( "this >= 0 && this < values.length - 1" ) int begin,
                         @declare( "this >= begin && this <= values.length" ) int end );

what a tease

I didn't spend enough time reading the thread. You're right, it's much less powerful than I described (and had hoped).

Booger.

Yum yum

Do you need to put the code into strings in order to avoid compilation errors?

Code in strings

Yes, you must put the code in strings. Annotation members can be primitives, strings, classes, enums, other annotations, and arrays of those, but not general "code blocks".
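To make that concrete, here is a small sketch of what "code in strings" amounts to. Declare is a hypothetical stand-in for the @declare annotation in the example, and DeclareDemo, contracts, and the body of sum are invented for illustration. The point is that the compiler and the reflection API see each contract as plain text; parsing and checking it is left entirely to whatever tool reads the annotation.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the poster's @declare annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD, ElementType.PARAMETER})
@interface Declare {
    String value(); // the contract, carried as uninterpreted text
}

class DeclareDemo {
    @Declare("this >= 0")
    static int sum(@Declare("this != null") int[] values,
                   @Declare("this >= 0 && this < values.length - 1") int begin,
                   @Declare("this >= begin && this <= values.length") int end) {
        int total = 0;
        for (int i = begin; i < end; i++) {
            total += Math.abs(values[i]);
        }
        return total;
    }

    // Collects the contract strings on a method and on each of its parameters.
    static List<String> contracts(Method m) {
        List<String> out = new ArrayList<String>();
        Declare onResult = m.getAnnotation(Declare.class);
        if (onResult != null) {
            out.add("result: " + onResult.value());
        }
        Annotation[][] perParameter = m.getParameterAnnotations();
        for (int i = 0; i < perParameter.length; i++) {
            for (Annotation a : perParameter[i]) {
                if (a instanceof Declare) {
                    out.add("arg" + i + ": " + ((Declare) a).value());
                }
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Method m = DeclareDemo.class.getDeclaredMethod("sum", int[].class, int.class, int.class);
        for (String c : contracts(m)) {
            System.out.println(c);
        }
    }
}
```

Nothing here checks that a string such as "this != null" is a valid expression; a typo inside it would compile without complaint.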

Ah

Thanks, I see.

I somehow managed to think the Java grammar would allow arbitrary expressions there, and apt could allow you to intercept the compilation process before the semantic check.

Java syntax extender

This really doesn't look that interesting, but there is a real macro system for Java, called the Java Syntax Extender. Using a somewhat syntax-case-like system, JSE can let you make macros that actually add new flow control without making you use string syntax to do it. The new flow control looks just like built-in Java syntax. The website is http://jse.sourceforge.net/.

JSE uninteresting

I don't think most Java developers are interested in a full-blown macro system, because that implies syntax changes, which implies an entire host of fragmented Java language dialects. You should see the fierce debates that rage over adding operator overloading to the language.

The opening up of a readable and writable AST, along with the ability to define well-specified annotations would mean exactly what jcheng was saying: a much better life for all sorts of bytecode and sourcecode rewriting libraries/tools. That applies both for the people who write them and for the people who use them.

Works well?

Do people do really nice things with annotations + AST-transforms in any existing compilers?

The Erlang compiler supports annotations and "parse transforms", but they're seldom used and pretty awkward compared to Scheme/Lisp macros.

I think so

It's hard to say that something works well when it doesn't exist, but my opinion is, yes, it would work quite well. I base that on two things:

1) The fact that bytecode enhancing tools are already quite popular (I've written and use several myself), and they are only growing more so with the addition of annotations to the language.

2) The assumption that it's easier to work with an AST than to rewrite bytecode. The information available in an AST is greater than that in bytecode, and the AST for the source would be on a higher level of abstraction than that for the bytecode.

C# has two annotations that c

C# has two annotations that cause AST transforms, that I know of.

Conditional
ThreadStatic

(Actually I'm not sure how ThreadStatic is implemented; maybe the annotation is directly supported by the VM.)

Not a valid comparison

That's not really a valid comparison to a user-modifiable AST. I'm pretty sure that I, as a developer, can't write a C# annotation and a corresponding callback which transforms the AST. The support for that annotation is hardcoded into the C# compiler.

Right, not user-modifiable

I didn't mean to imply that the user could do such a thing in C# today, only that C# has two examples of useful annotation-based transforms.

Take a look at Nemerle. Their

Take a look at Nemerle. Their macro system allows macros which can be used like C# attributes (it also allows fairly arbitrary syntax transformations, but the attribute-style macros are more relevant to this discussion). For example, the [Serializable] attribute is implemented as a macro that can be applied to a class declaration; it automatically adds the ISerializable interface to the class and generates the Serialize() method (see here for some details). They've also used macros for some more interesting things, such as DBC (preconditions, postconditions, and class invariants).

Macros != fragmented dialects

...syntax changes, which impl[y] an entire host of fragmented Java language dialects.

I still can't understand why people think that adding macros makes a new language. Doesn't adding your own class make it a new language just as much? You have to learn what new macros do and how to use them, but the same goes for new classes. It is an issue, however, if there are two competing macro systems for Java; that would cause fragmented dialects of Java if that's what you mean (but I don't think it is).

The opening up of a readable and writable AST, along with the ability to define well-specified annotations would mean exactly what jcheng was saying: a much better life for all sorts of bytecode and sourcecode rewriting libraries/tools. That applies both for the people who write them and for the people who use them.

Isn't that an advantage of macros?

It's all about syntax.

I still can't understand why people think that adding macros makes a new language.

Most Java developers are not offended by new optional language features; they are opposed to changes in the language syntax. There will likely never be any macro feature for the Java language that allows users to effect wholesale changes to the syntax accepted by the Java compiler. Lisp macros are less "abhorrent" in this manner, because Lisp syntax is just tokens demarcated by parentheses no matter how you slice it.

Isn't that an advantage of macros?

Annotations + writable ASTs give you those capabilities without also letting you change the language syntax.

One step away from full meta-programming.

Since meta-programming is so useful, I wonder why Sun did not put it in Java. Annotations seem to me a half step towards meta-programming.

Long Strange Trip

This quote helped me understand the design of Java much better:

Bill Joy had always been the guardian angel of this project, swooping in to show his support, or save us from budget cuts, but after my blowup with Wayne, he took a more direct role. He wanted to use Oak as a systems programming language for his new operating system ideas. He was often comparing Oak to more complicated and elegant languages like Python and Beta. He would often go on at length about how great Oak would be if he could only add closures and continuations and parameterized types. While we all agreed these were very cool language features, we were all kind of hoping to finish this language in our lifetimes and get on to creating cool applications with it. The more we argued with Bill about making those changes the more strongly he would fight us. After a while it became a choice between not having Bill involved at all or losing control of the language. James and I got into a rather epic battle with Bill in his office in Aspen one evening about this issue. He started out by insulting both of us about how poorly Oak compared to better languages and then he volunteered to resign from being the Live Oak architect if we wanted him to. James and I agreed that would be best and walked out to go across the street to watch "Speed". What a rush.

The next day, Bill was pretty much his normal old nice guy self again, a little relieved, I think, to be out of the role of being directly responsible for our destiny. Bill is annoyingly smart, and I wouldn't put it past him to have planned that whole scenario the night before to force us to defend our positions and take ownership of our project. The interesting thing is that we were right about needing to finish the language even though it had missing features. It was a timing issue, there was only about a three month window in which the whole Java phenomenon could have happened. We barely made it. It is also interesting that Bill was absolutely right about what Java needs long term. When I go look at the list of things he wanted to add back then, I want them all. He was right, he usually is.

Full text of The Long Strange Trip to Java.