Java Open Review Project

The Java Open Review Project identifies and reports bugs and security vulnerabilities in widely used Java open source software...The identification process is powered by Fortify Source Code Analysis (SCA) and FindBugs.

This may be an easy way to get a glimpse of what static analysis tools can do. Reviewing the potential defects identified by the automatic tools is a good starting point for thinking about the implications of various decisions about language semantics, and about language expressiveness in general.

To see actual code, you need to log in using the guest account details provided on the page.


NullPointerException

It is interesting to note how many of the problems pertain to use and misuse of the null constant; with enough foresight, the original Java language could have bypassed such problems by making reference types not be nullable unless explicitly declared to be (with a type constructor keyword nullable), and permitting nullable -> nonnullable conversions only with an explicit cast. @NonNull annotations (as in IDEA, FindBugs, and an upcoming JSR) may be better than nothing, but this is a workaround for a deficiency in the type system.

Besides (arguably) improving program safety and clarity, this might have enabled a modest performance gain by permitting dynamic or offline native compilers to omit null checks for nonnullable variables. Perhaps HotSpot can do such optimizations already for local variables that can be proved to be non-null just by looking at local context, but a stronger type system would permit the optimization even across method, class, or component boundaries.

Such a language would have some complications involving initialization sequences for fields:

class A {
  A() {
    m();
  }
  void m() {}
}
class B extends A {
  /*nonnullable*/ String s;
  B() {
    s = "something";
  }
  @Override void m() {
    System.out.println(s.length());
  }
}

The above code should be statically rejected since s will be undefined when m is called; internally implementing it as an initially null reference (as in the current JVM) would violate type safety. Some language change would be needed in order to treat the field like a blank final variable which needs to be initialized immediately. Probably the solution here is to prohibit the call to the nonfinal instance method m from A's constructor; even in the actual Java language this is usually a dangerous practice.
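
For comparison, here is a small runnable sketch of what current Java does with code of the same shape. The names Base, Derived and InitOrderDemo are made up for illustration; the only claim is that the overridden method runs before the subclass constructor body has assigned the field.

```java
// Hypothetical names; mirrors the A/B example above in plain Java.
class Base {
    Base() {
        m(); // dispatches to Derived.m() before Derived's constructor body runs
    }

    void m() {}
}

class Derived extends Base {
    String s; // still null while Base() is running

    Derived() {
        s = "something";
    }

    @Override
    void m() {
        System.out.println(s.length()); // s is null at this point
    }
}

class InitOrderDemo {
    // Returns true if constructing Derived fails with a NullPointerException.
    static boolean constructionThrowsNpe() {
        try {
            new Derived();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(constructionThrowsNpe()); // prints "true"
    }
}
```

So the program compiles today and only fails at run time, which is the hole a non-nullable field type would have to close.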

Nice programming language

The Nice programming language compiles to JVM bytecode and provides default non-nullable reference types and option types (nullable reference types)... and multi-methods and ...

Nullity in Java

It is interesting to note how many of the problems pertain to use and misuse of the null constant

The nowhere-near-ready-for-peer-review numbers I've seen suggest that something like 70% of bugs in Java manifest to the programmer as NullPointerExceptions. This doesn't mean that forgetting to check for null is the cause of the bug, merely that that's how the underlying bug throws a symptom. 70% is, of course, a shockingly huge percentage, but reasonable based on my experience. Note that most "I didn't understand the spec" bugs manifest as NPEs, upping the percentage greatly.

with enough foresight, the original Java language could have bypassed such problems by making reference types not be nullable unless explicitly declared to be (with a type constructor keyword nullable), and permitting nullable -> nonnullable conversions only with an explicit cast

Sad to say, but in the context of 1995 that would almost certainly have been enough to prevent the rise of Java as a mainstream language. The perfect, here, is decidedly the enemy of the good.

@NonNull annotations (as in IDEA, FindBugs, and an upcoming JSR) may be better than nothing, but this is a workaround for a deficiency in the type system.

Considerably better than nothing. The main virtue of these constructs is that they are purely optional, and thus can be folded into developers' workflows as they become comfortable with nullity types. That said, in terms of preventing actual bugs in production software, JSR 305 is probably the most important short-term initiative going.

Probably the solution here is to prohibit the call to the nonfinal instance method m from A's constructor; even in the actual Java language this is usually a dangerous practice

Decent static analysis tools (including IDEA and FindBugs) do flag calls to non-private/non-final/non-static methods in constructor/initializer context, as well as constructors where "this" escapes the constructor scope. Those are just way too subtly dangerous for comfort.

(Full disclosure: I wrote most of the static analysis tools in IntelliJ IDEA.)

Decent static analysis tools

Decent static analysis tools (including IDEA and FindBugs) do flag calls to non-private/non-final/non-static methods in constructor/initializer context, as well as constructors where "this" escapes the constructor scope.

I can see the danger of the overridable call, but it looks like something that can also be useful. Is there a safe way to do it?

I am not sure why I would want "this" to escape the constructor scope, for an event handler? What is the danger here?

Sing# Paper

Some people at Microsoft Research wrote a paper about supporting non-null references in a Java-like language. Interestingly, non-null references subtly forced them to also tackle the problem of incompletely-initialized objects.

Declaring and Checking Non-Null Types in an Object-Oriented Language

Initialization guarantees

I can see the danger of the overridable call, but it looks like something that can also be useful. Is there a safe way to do it?

I am not sure why I would want "this" to escape the constructor scope, for an event handler? What is the danger here?

The danger for both of these constructs is that they allow external access to partially-formed objects. If you avoid these two constructs in Java, you can be guaranteed that any referenced object was created by a call to a constructor, and that call was fully completed before any other reference to the object (other than the language atrocities of serialization and clone(), but that's another rant). Allowing visible partially-formed objects basically makes reasoning about class invariants impossible, and can expose security holes.
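
A minimal sketch of the "this escapes" case, with invented names (EagerRegistry, Widget): the registry reads a final field while the constructor is still running and sees its default value rather than the value the constructor is about to assign.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical registry that inspects whatever registers with it straight away.
class EagerRegistry {
    static final List<String> observed = new ArrayList<>();

    static void register(Widget w) {
        // Reads the widget's state while its constructor is still running.
        observed.add(String.valueOf(w.name));
    }
}

class Widget {
    final String name;

    Widget(String name) {
        EagerRegistry.register(this); // "this" escapes before name is assigned
        this.name = name;
    }
}

class EscapeDemo {
    public static void main(String[] args) {
        new Widget("button");
        System.out.println(EagerRegistry.observed); // prints "[null]"
    }
}
```

Even a final field is observed as null here, so an invariant like "name is never null" cannot be relied on by code that receives the escaped reference.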

Finalization to the (un)rescue

If you avoid these two constructs in Java, you can be guaranteed that any referenced object was created by a call to a constructor, and that call was fully completed before any other reference to the object.

That's not true. If your class isn't final (or doesn't have a final finalize()), a subclass can override finalize() and get access to a partially constructed object when an exception is thrown in the constructor.

It's the sixth security anti-pattern in this presentation.
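
A rough sketch of how that can look, with invented class names (Guarded, Attacker) and not taken from the presentation. Whether and when the finalizer actually runs depends on the garbage collector, and finalization is deprecated in recent JDKs, so treat the second line of output as "may happen", not "will happen".

```java
// Hypothetical classes illustrating the finalizer route to a partially constructed object.
class Guarded {
    private final boolean checked;

    Guarded(String password) {
        if (!"secret".equals(password)) {
            throw new SecurityException("bad password"); // construction is abandoned here
        }
        checked = true;
    }

    boolean isChecked() {
        return checked;
    }
}

class Attacker extends Guarded {
    static volatile Guarded stolen; // set only if the finalizer ever runs

    Attacker() {
        super("wrong"); // always fails the check
    }

    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() {
        stolen = this; // the object never finished construction, yet it is reachable again
    }
}

class FinalizerDemo {
    // Returns true if the constructor rejected the attacker as intended.
    static boolean constructorRejects() {
        try {
            new Attacker();
            return false;
        } catch (SecurityException e) {
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(constructorRejects());
        System.gc();       // only a hint; finalization is not guaranteed
        Thread.sleep(100);
        System.out.println(Attacker.stolen != null ? "resurrected" : "not finalized yet");
    }
}
```

Declaring Guarded final, or giving it a final finalize(), removes the hook the subclass relies on.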

Java semantics (if such a thing exists) is highly complex, and there are many pitfalls due to feature interaction. I don't know if there are other situations where a partially constructed object could be accessed, and we may never know whether we have covered all our bases unless the semantics are formally defined.

It's the sixth security anti-pattern

It's the sixth security anti-pattern in this presentation.

Thanks for this link. It is interesting to see that all the anti-patterns shown there are directly due to either mutable state or inheritance. And some are not immediately obvious, even to the seasoned programmer. Good ammunition for the next FP-vs-OO flame war. :-)

OO immutability types?

Are there any OO languages with good static analysis support for mutable/immutable? In the solution to the first anti-pattern in the slides they use a shallow copy; it would be nice to have the compiler guarantee the contents of the mutable container really are immutable. One could then even perhaps go on to have the compiler optimize things like clone() based on knowing the im/mutability of the items.

You could presumably use generics and/or make new types in Java such that the collection could only have been constructed so that it contains no mutable objects. So do you roll everything yourself, or does your language give standardized things to you? Nullness, immutability... perhaps things which are too important to leave out of the core language?
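
Roughly what the roll-your-own route might look like in Java. Immutable, Point and FrozenList are invented names, and the marker interface is a promise made by the programmer, not something the compiler verifies, which is exactly the gap being asked about.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical marker: implementors promise (by convention only) to be deeply immutable.
interface Immutable {}

final class Point implements Immutable {
    final int x, y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// A list that can only hold Immutable elements and cannot be changed after construction.
final class FrozenList<T extends Immutable> {
    private final List<T> items;

    FrozenList(List<? extends T> source) {
        // Copy first, so later changes to the caller's list are not visible here.
        this.items = Collections.unmodifiableList(new ArrayList<T>(source));
    }

    T get(int i) {
        return items.get(i);
    }

    int size() {
        return items.size();
    }
}
```

With this, FrozenList<StringBuilder> is rejected at compile time because StringBuilder does not implement Immutable, but nothing stops someone from writing a mutable class that claims the marker.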

Your order is ready sir

A language that supports OO* and has good static analysis for mutability? You don't need to search further than Haskell.

*For large values of OO and small values of support. YMMV. This offer is void if you aren't a Haskell über-hacker.

The ST monad and Oleg's

The ST monad and Oleg's regioned variant of it are worth mentioning under static analysis for mutability, too.

Unknown agent gets hold of hazardous item

Ok I see now, they are both the same: an unknown method gets a reference to a partially-initialised object 'this'.

Would we need a special marker to say "method called by the constructor, do not read the value of instance variables in here" for init-type methods? And maybe the constructor should hard-wire unmarked method calls to use the non-overridden version of them, making the methods temporarily 'final' until the constructor returns. That could break some (possibly buggy) programs, and I am not sure if it is type-safe (sound?).

The object should also make sure that it registers itself (eg. to an event handler...) after all initialisations, but Java does not support "after" blocks. Alternatively there could be a contract that the registrant does not access the object immediately, and that users run in the same thread as the object constructor. That seems very difficult. Defining an "after" block that is executed after all the constructors finish is undoubtedly easier.
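
In current Java the closest thing to an "after" block is probably a static factory method. A sketch under that assumption, with invented names (EventBus, Handler); note that the private constructor also rules out subclasses outside the class itself, which is a restriction a real "after" block would not need.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event source; stands in for whatever the object registers with.
class EventBus {
    final List<Handler> handlers = new ArrayList<>();

    void register(Handler h) {
        handlers.add(h);
    }
}

class Handler {
    private final String name;

    // Private: the only way to obtain a Handler is through create().
    private Handler(String name) {
        this.name = name;
    }

    // Plays the role of the "after" block: runs once construction has finished.
    static Handler create(String name, EventBus bus) {
        Handler h = new Handler(name);
        bus.register(h); // registration sees a fully initialised object
        return h;
    }

    String name() {
        return name;
    }
}
```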

Constructor should be a class method

If classes were objects (as in Smalltalk) you could use class methods to initialize the object, not instance ones. It would also make factory patterns a non-issue. The object state could be defined as a simple record (using & for object composition):


class Foo extends Baz {
    final int x;
    int y;
    // class method
    Foo Foo.new(Bar bar) {
        int i = baz(); // calling the class method named baz
        Foo this = Baz.new(bar);
        return this;
    }
    // class method
    int Foo.baz() {
        // no this available
        return this.x * this.y; // error
    }

    // instance method
    int baz() {
        return x * y;
    }
}

The benefits are many, without going too far from Java's syntax: no problematic partial objects; descriptive constructor names become possible; constructors can be polymorphic (i.e. just pass an instance of Foo's subclass and invoke dynamically); hook methods in class methods are fine; validation and initialization are kept separate; objects can be cached instead of creating new instances all the time; and so on.

Is there any reason this

Is there any reason this isn't done? I have always been amazed by the need to create a special syntax for just this operation. I guess it might be just to stay closer to C++ syntax, but that seems like a very weak argument.

If you don't have a special

If you don't have a special name for all constructors, e.g. new or ctor, you end up having to pass the name as a parameter to other classes. Think of a collection class that needs to be able to create default objects: it needs to know the name of the c'tor to use. Eiffel has to do this, for example.

Named constructors are very useful, though: imagine a vector object with Vector2d.cartesian(x, y) and Vector2d.angular(rho, length), or Length.metres(m) and Length.inches(i). There is a named constructor idiom for C++/Java/C# that emulates this.
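
In Java the idiom is usually spelled as static factory methods over a private constructor. A sketch, assuming rho is an angle in radians (the names come from the example above, the interpretation is mine):

```java
// Hypothetical Vector2d using static factory methods as named constructors.
final class Vector2d {
    final double x, y;

    // Private, so callers must pick one of the named factories below.
    private Vector2d(double x, double y) {
        this.x = x;
        this.y = y;
    }

    static Vector2d cartesian(double x, double y) {
        return new Vector2d(x, y);
    }

    // Assumption: rho is the angle in radians, measured from the x axis.
    static Vector2d angular(double rho, double length) {
        return new Vector2d(length * Math.cos(rho), length * Math.sin(rho));
    }

    double length() {
        return Math.hypot(x, y);
    }
}
```

Both factories take two doubles, which is exactly the case ordinary overloaded constructors cannot distinguish.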

You still need syntax to create the 'naked' object. Based on the example above, using C9x-esque structure literal syntax:

Foo Foo.new(int x, int y) {
    Foo this = { .x = x, .y = y };

    return this;
}

In general I think the reason this elegant and clean solution isn't used is lack of imagination.

Yes, much better...

For a very nice system which includes this idea, see A Core Calculus of Metaclasses. I really enjoyed this paper and seem to recommend it every chance I get... I'd love to play with a language based on this.

proving properties in java?

Does anyone have any references that talk about proving properties about programs (written in mainstream languages such as Java, C, etc.) in the following manner:

Either through an IDE, 'annotation' or dummy methods, say what should be true:

int doSomething(int myargument) {
    mustBeTrue(myargument, ">", 0); // property to be proved before the program runs
    return doSomethingElse(myargument);
}

The mustBeTrue predicate must be checked before the program is run; if the checker can't prove the property, it should say so.

What goes into writing a static checker such as that? My initial hunch is that I would have to learn about theorem provers, perhaps constraint programming?

ESC-Java and JML

Google for ESC-Java and JML. There's a lot of work out there.
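
For a taste of the notation, here is a rough sketch of the doSomething example with the condition written as JML annotations in comments. The class name Checked and the body of doSomethingElse are made up; the requires and ensures clauses are ordinary JML, but which properties a given checker can actually discharge depends on the tool.

```java
// Hypothetical class; the //@ lines are JML specifications, ignored by javac.
class Checked {
    //@ requires myargument > 0;
    //@ ensures \result > 0;
    static int doSomething(int myargument) {
        return doSomethingElse(myargument);
    }

    //@ requires n > 0;
    //@ ensures \result == n;
    static int doSomethingElse(int n) {
        return n;
    }
}
```

Because the specifications live in comments, the file still compiles as plain Java; a static checker reads the annotations and reports call sites where it cannot establish the precondition.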

See Also...

...Krakatoa, Daikon, and DSD-Crasher.

Ob. FP note: one of DSD-Crasher's developers, Yannis Smaragdakis, is also co-developer of FC++, arguably the first really good implementation of functional programming techniques in C++ (but see Phoenix V2 for the current state of the art).