Lambda the Ultimate

Womble
started 7/10/2001; 6:23:35 AM - last post 7/11/2001; 9:50:41 AM

7/10/2001; 6:23:35 AM (reads: 2098, responses: 4)

Womble

Womble is a lightweight tool for extracting object models from Java bytecode. From bytecode, Womble extracts object associations, inheritance links, and class dependences. Womble also infers types in collections (such as hashtables), so that the class in the collection, rather than the collection class, appears in the object model. The object association edges are annotated with mutability and multiplicity information which is automatically extracted.
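To make the description above concrete, here is a minimal sketch of how much object-model information survives compilation to a Java class file. It is not how Womble works (Womble analyzes bytecode statically, and nothing here comes from its source); it uses the standard java.lang.reflect API as a stand-in, and the classes Shape, Circle and Point and the method describe are hypothetical names invented for the illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class ObjectModelSketch {
    // Hypothetical example classes; not taken from Womble or the thread.
    static class Shape {}
    static class Point {}
    static class Circle extends Shape {
        private Point center;              // association edge, mutable
        private final double radius = 1.0; // association edge, final
    }

    // Lists the inheritance link and field "edges" recoverable from a compiled class.
    static List<String> describe(Class<?> c) {
        List<String> edges = new ArrayList<>();
        Class<?> parent = c.getSuperclass();
        if (parent != null && parent != Object.class) {
            edges.add(c.getSimpleName() + " extends " + parent.getSimpleName());
        }
        for (Field f : c.getDeclaredFields()) {
            if (f.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            String mutability = Modifier.isFinal(f.getModifiers()) ? "final" : "mutable";
            edges.add(c.getSimpleName() + " -> " + f.getType().getSimpleName()
                    + " (" + f.getName() + ", " + mutability + ")");
        }
        return edges;
    }

    public static void main(String[] args) {
        // Field order is not guaranteed by getDeclaredFields().
        for (String edge : describe(Circle.class)) {
            System.out.println(edge);
        }
    }
}
```

The superclass and the declared field types are all read back from the compiled class, without the source. Inferring the element types held in collections, as the Womble description mentions, would need real analysis of the method bodies and is beyond this sketch.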

This looks interesting. I see Womble uses DOT to draw the graphs, and I had trouble installing DOT when I tried a year or two ago.

I think that the ability to decompile bytecodes to meaningful diagrams should tell us something about the sad state of programming languages. To see what I mean, reflect on how far removed machine language is from high-level languages, in terms of abstraction power.

I wonder if we need a program analysis department? This is an important area of study, of course.

Posted to OOP by Ehud Lamm on 7/10/01; 6:24:50 AM

John Lawter - Re: Womble

7/10/2001; 7:43:11 AM (reads: 1051, responses: 0)

I think that the ability to decompile bytecodes to meaningful diagrams should tell us something about the sad state of programming languages

I'm not sure I understand what you mean by this. I mean, isn't it good that Java is using a machine language which *preserves* information (that is, if you consider compilation a compression process, the compression of Java source into Java bytecode is less lossy than the compression of, say, C into machine language)?

(Or have I answered my own question...)

I agree with you that Womble sounds interesting; along with tools like jad it can make up for poor documentation, when source isn't available.

Ehud Lamm - Re: Womble

7/10/2001; 10:58:29 AM (reads: 1032, responses: 0)

Well, aside from VM fascination I don't see why VM bytecode should preserve the information in the high level source. Where do you draw the line? Why shouldn't we program the VMs directly?

Seems to me that high level languages are supposed to offer better abstractions, and better structuring methods for large systems, than what's available at the machine or virtual machine level.

That's why I find situations like this to be fishy. Is it really that the VM is expressive, or that the source language isn't?

John Lawter - Re: Womble

7/10/2001; 2:00:23 PM (reads: 1028, responses: 0)

Well, I think some information is useful. From a debugging standpoint (and this is purely a personal thing), when I print a reference, I like seeing the class of the object it refers to, along with the object's location in memory. Java's VM isn't the only one that does this, right? Don't Lisp VMs also "tag" pointers?
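As a small illustration of the point about printing a reference, the sketch below shows what Java prints by default for an object that does not override toString. The class Account and the helper defaultString are hypothetical names made up for the example. One caveat: the number after the "@" is the object's identity hash code in hexadecimal, which the Java specification does not promise to be a memory address, so "location in memory" should be read loosely.

```java
public class ReferenceTagSketch {
    // Hypothetical class that overrides neither toString() nor hashCode().
    static class Account {}

    // Rebuilds by hand the string that Object.toString() is documented to return:
    // the runtime class name, an '@', and the hash code in hexadecimal.
    static String defaultString(Object o) {
        return o.getClass().getName() + "@"
                + Integer.toHexString(System.identityHashCode(o));
    }

    public static void main(String[] args) {
        Object ref = new Account();
        // The static type here is Object, yet the runtime class is still printed.
        System.out.println(ref);
        System.out.println(defaultString(ref));
    }
}
```

Both lines print the same text, and the class name comes from information the runtime keeps with every object, which is the kind of tagging being discussed here.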

I think a language designer must draw the line when the info being preserved isn't useful to the runtime system. For example, in some circumstances, it may be useful to tag a method or variable with a permission or capability, so that the runtime system can ensure that only approved classes are calling or accessing it. Also, code segments could be tagged "initialization only" or something, so that code used only once per program execution could be flushed from cache right away rather than taking up space.

Ehud Lamm - Re: Womble

7/11/2001; 9:50:41 AM (reads: 992, responses: 0)

There's quite a difference between retaining information for debugging purposes, and having to retain the information for correct execution.

Some languages, think of Scheme as an example, are based on retaining type information. An optimizer can sometimes reduce the amount of information that needs to be handled during runtime.

More static languages are designed so that all type information (for example) that is not directly needed because of explicit mechanisms like late binding can be reasoned about and eliminated at compile time. Unless there are good reasons to do otherwise, I find this approach better.

Some of this discussion pertains to the discussion on generics in the CLR, which touched on carrying type information during runtime.