Battling Bugs: A Digital Quagmire

From this Wired article:
In 1976, computer pioneer Edsger W. Dijstra made an observation that would prove uncanny: "Program testing can be quite effective for showing the presence of bugs," he wrote in an essay, "but is hopelessly inadequate for showing their absence."

Thirty tears later, Dijsta's words have the ring of prophecy. Companies like Microsoft and Oracle, along with open-source projects like Mozilla and Linux, have all instituted rigorous and extensive testing programs, but bugs just keep slipping through. Last month, Microsoft's monthly drop of bug patches included fixes for 14 security holes that escaped prerelease testing, four of them rated "critical."

There's a bit to chew on here, including the by-now-de rigeur misidentification of Java, Python, and Perl as "type-safe languages." But I think the article is valuable in spite of that for its frank admission that even intense testing regimes aren't doing well at addressing serious quality issues. Daniel Jackson, leader of the Alloy project at MIT, is quoted.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Type Safety

The term "type safety" occasionally means different things to different people; as the claim that Java et al are not "typesafe" illustrates.

A couple of different levels of type safety perhaps might be worth mentioning:

* Absolute type safety--type errors are not possible in a well-formed program; the implementation guarantees that every operation on every object will succeed; else the program is not well-formed. Type erasure is possible and is often performed as a runtime optimization. The statically-typed FPLs are in this category.

* Nominal type safety--type errors are not possible in a well-formed program, unless explicit "dynamic" constructs (instance_of, dynamic_cast, class introspection) are used. When type errors do occur; the behavior of the system is deterministic and it's integrity is not compromised. Java 1.4 comes to mind (though not Java 5, due to issues with the generics implementation).

* Nominal type safety with exceptions. Type errors are not possible in a well-formed program; unless explicit "type unsafe" operations are performed. Such explicit operations may, if done incorrectly, cause undefined behavior and/or violate the integrity of the programming model. Examples include C#, Eiffel, and Modula 3.

* Dynamic type safety. Type errors are possible in well-formed programs (and do not require use of explicit constructs which "waive" typechecking). The behavior of the system in the presence of type errors is well-defined; programs can often productively "trap" type errors as a metaprogramming construct. Java 5 is one example; as are most dynamically-typed languages (Smalltalk, Python, Ruby, Lisp).

* Superficial type safety. The system performs typechecking on programs (and rejects many improperly-typed programs as ill-formed); but the language definition contains numerous "holes" (which even experienced programmers may unwittingly stumble into) that can cause type errors to occur undetected at runtime, resulting in undefined behavior and violation of the integrity of the programming model. Examples include C and C++.

* Minimal type safety. The system performs little or no type analysis, either at compile-time or runtime. Type errors may have undefined behavior and compromise system integrity. Examples include assembler, BCPL, and most Forth dialects.

By use of more precision in our language, and by avoiding binary claims likely to be considered inflammatory (such as "Java is type-unsafe", when a different level of type-safety is considered important by speaker and listener), arguments might be avoided.


I don't know enough to know if one can poke holes in your comments; they make a lot of sense to me and are appreciated. It always helps me tremendously to see such categorizations.

Great Taxonomy!

First, this is an excellent taxonomy that I will refer to from now on—too bad I've never seen its like in any of the literature.

But Dave Griffith has responded exactly correctly, IMHO, and said what I should have: the claim isn't that Java, Python, or Perl aren't type-safe by any definition of that term, but rather that the property that they have that Garfinkel called "type-safety" was rather "memory-safety." From that point of view, the observations following in the article make much more sense: memory-safety is indeed only one, relatively trivial, aspect of safety. But a naïve reader will likely come away with the perception that type-safety doesn't help avoid a wide class of bugs—in this case, a wider class than memory-safety alone can provide.

Interestingly, the example given of what "type-safety" can't do, ensuring that a lock is used correctly, is exactly one of the things that linear type systems do. So I'm afraid it's another case of conflating what popular type systems don't do with what static type systems can't do. If people who write about the topic would just quit making this particular class of errors, you'd find me writing a lot less on the subject.

In any case, that wasn't what interested me about the article most. What interested me most was that Garfinkel bothered to look at big, complex systems with big, comprehensive testing regimes and observe that there are still really unfortunate gaps with them, and he pointed to Lightweight Formal Methods as an alternative. I find that fascinating even leaving aside the confusion over type-safety vs. memory-safety.


...although I suspect that the taxonomy could be tightened up a bit (the issue of "unsafe" constructs ala C# or Modula 3 is really orthogonal to some of the other issues) and better formalized. Better, more appropriate labels might be invented--mine were rather ad-hoc.

Hmm... maybe someone should publish this (once done). As I currently reside in industry and not academia; I've no particular pressure to do so.

A couple of other interesting points which might be worth addressing, should someone want to take this further:

* The effect of substitutabilty, especially in type systems with subtyping. OO languages, for instance, by allowing non-Liskov subtyping, arguably permit more errors to be introduced into a program which passes the typechecker or inferencer. OTOH, proponents of OO claim (and I agree with them) that the ability to have different types which have some similarity (represented by a supertype) but observable differences otherwise, is a good thing.

* Type coercions. Many languages allow implicit type coercions which can allow incorrect programs (i.e. ones that don't do as intended) to pass the typechecker, often with bizarre results. Multiplication of strings is a longstanding example.

Memory-safety fine distinctions

Even if Java makes "core dumps" impossible, it still keeps a possibility of many bugs related to memory that are just too easy to make and very hard to fix.

Not even mentioning OutOfMemoryError, there are also issues with Java memory model leading to concurrent bugs, and lately I've stumbled upon problems with permanent generation space while extensively generating classes on the fly.

Do we need another taxonomy for memory-safety? Or rather a branch on the existing one for type-safety?

I'm not sure how the type system

can prevent out of memory bugs. (Actually, I can think of many ways that languages can guarantee their absence; it's just that those ways will make the language difficult to use and/or unsuitable for most applications).

Likewise, I'm not sure that I'd classify concurrency failures as type system bugs--the Java type system says very little about concurrency (other than that reads/writes to certain things are required to be atomic). One can, of course, treat any interesting program invariant as a type, but making practical use of such capability is still the stuff of research.

Also, I was under the impression that many of the issues dealt with in the paper had been resolved in more recent versions of Java. In particular, wasn't double-checked locking; long known to be broken, fixed in Java 1.4--the paper you cite is quite a few years old, and recent changes to the Java spec have been made to deal with these issues.

Of course, I'm of the opinion that unrestrained shared-state concurrency is asking for trouble in the first place--and that the fatal flaw of Java in this regard is its embrace of shared-state concurrency; something which won't be fixed by tweaking either the type system or the JVM. (Perhaps introduction of domains, or some other way of segregating objects into "referenced by multiple threads" and "referenced by a single thrad", might be a start).

Java's types aren't even close to real types

I'm going to try to phrase this carefully.

MLKit is a region inferring ML compiler. It can statically prove the memory bounds of code. It does not make the language any harder to use than plain ML, since it's just plain ML. The question of ease of ML vs Java is open, obviously. I believe they tote it as being great for the embedded space. It actually still uses a GC, but for a much reduced set; does anyone know whether it would be possible to entirely do away with it? Would it be more possible in Haskell where purity is guranteed?

Similarly, shared-state concurrency is not a bad thing. It should be used at the same time as message passing concurrency, and can be done safely. The software transaction memory work done by the Haskellers recently has shown how to compose share state operations together safely. In Haskell, this can, and is, guranteed by the type system. I would go so far as to say that right now, the problem of shared state concurrency in Haskell is solved; I'm sure I'll get picked up on that.

In short, both of the points that you raised are true, in the sense that they are not necessarily failings of the type system. However, there is much that a good type system can do to help. I also wonder how the DT crowd would approach these two problems? Is there a standard method(s) available?

It's my understanding that re

It's my understanding that region interference can compleatly remove the need for a garbage collector, but that the resulting system svallovs up to five times as much memory as a system with GC would.

The Metronome people at IBM h

The Metronome people at IBM have a (somewhat firm, from what I understand of the various arguments) real-time garbage collector for Java. I wonder if some combination of region inference (to narrow down the amount of memory to be collected) and such a "real-time" garbage collector might get closer to the limits and requirements of the embedded/RT space.

There's always the IO monad

It actually still uses a GC, but for a much reduced set; does anyone know whether it would be possible to entirely do away with it? Would it be more possible in Haskell where purity is guranteed?

Despite purity Haskell still allows effects in the IO monad, so we still would need GC for MVars, IORefs and their friends.

Java Memory Model

As was already mentioned, you should update your papers about the JMM - JSR 133 changed it for Java 5. See the FAQ, for a quick reference on the changes.

Yes except i don't understand

Yes except i don't understand how divide by zero avoids being a type error in most statically typed FPLs.

By not being worth the pain c

By not being worth the pain caused ("you mean I've got to manually convert to non-zero-int before each division?"). It's not as if most statically typed FPLs don't have partial functions.

Wouldn't every language have

Wouldn't every language have absolute type safety then, depending on how much pain you can stand?

You mean "it's ok, it's just

You mean "it's ok, it's just a partial function that happens to crash on that input"? :-) Yeah, but the alternative involves a different kind of pain that'd make much arithmetic unbearable. IMO it's acceptable in the presence of exceptions in a way that falling over because you tried to treat that integer as a pointer and dereference it wouldn't be. Undefined behaviour and exceptions're very different levels of safety.

Yes, I'm not objecting to the

Yes, I'm not objecting to the practice in it self, it's more about the classification. (And I seem to take for granted that functions raise exceptions rather than do something undefined.) I just think you shouldn't go calling something "Absolute type safety" then the only thing that differencing it from what you call "Nominal type safety with exceptions" is that the former only includes the exceptions that you personly happen to like. I would like some more formal distinction. At least for something containging the word "absolute".


I've never seen the term "absolute type saftety" used by rsearchers in the field. Quite alot of effort goes into specfiying exactly what the terms mean, including the specification of "variant errors" (like division by zero), not handled by the type system.

EOPL says something about this (concerning division by zero specifically, IIRC). Shriram's book talks at length about the precise wording of the type soundness theorem.

(though not Java 5, due to is

(though not Java 5, due to issues with the generics implementation).

Interesting! Could you elaborate, or give a pointer?

It's been discussed before.

Quite a few papers on the topic (many of them coming from the authors of a rival generics proposal--one not accepted by Sun/JVP--which doesn't have this problem, but requires much more substantial changes to the JVM in order to work. The rival generics implementation is similar to how C#/DotNet does it).

Perhaps I'm being a bit unfair to Java 5, as the issue doesn't arise unless you use introspection. However, the problem is essentially as follows:

Note that I use square brackets ([]) rather than angle brackets (<>) to delimit the arguments to generics, because the site software treats the latter as markup and I'm too lazy to type &lt; and &gt; all over the place.

* Java 5 generics use a combination of type erasure and silently-inserted downcasts. If you type Vector[Int] or Vector[String]; the resulting class is essentially the same as Vector[Object]. As far as the JVM is concerned, the classes are one and the same; the distinction between different parameterized types is made ONLY at compile-time.

* When you retrieve something from a Vector[X], Java essentially inserts a downcast in the retrieval operation which verifies that the retrieved object (remember, the underlying container as implemented has no knowledge of the type parameter) is indeed of type X. As the compiler verifies (at compile time) that only X's are inserted into a Vector[X], this downcast will usually always succeed in a well-formed program. This downcast is required by the JVM, as as far as the JVM is concerned, the type is a Vector[Object].

* If, however, you use introspection to retrieve a Vector[X] from something, it is easy to fool the compiler into thinking that a given term is of type Vector[X] when it's really Vector[Y]. Remember, the JVM/runtime CANNOT TELL THE DIFFERENCE.

* When the language is so fooled; and one attempts to retrieve an X from something which is really a Vector[Y], the inserted downcast will fail, throwing BadCastException. The exception occurs in a place where the user doesn't expect it--during the middle of an assignment operation which appears (to both human and compiler) to pass the typechecker.

"Type Safety"

The actual problem with the "type safety" comment is that the author of the article doesn't actually mean anything about types. He's refering to memory models, and the fact that the languages he lists prevent dangling pointers, buffer overflows, and all those other exciting things that made programming with 1970's era system languages such an enjoyable now-lets-juggle-razor-blades experience.

Simpson Garfinkel. He actually wrote a pretty good book on Objective-C sometime back. I would have thought he'd know better.

Worse he acts like this solves most problems

I don't wish to downplay the cost of what the author refers to as "type safety", but it isn't as big as he would leave us to think.

The author of the vstr library did a study of real-world problems on RedHat Linux. He concluded that only about 40% of all serious problems could be prevented with a better language. This would be a great improvement, but it isn't near as large as most people think.

In the authors defense, if you read the second page of the article he makes it clear that there are other problems. However between buffer overflow type problems listed first, and that few people will read to the end, the impression is that buffer overflows are the only problem.

In truth buffer overflows are the easier problems to fix, and generally the easiest to spot.

I suspect that if Multics was used as an Internet server today it would have been cracked a few times. The parts of the puzzle are just too many and too complex. Though it would have much less successful attacks that the typical Unix or Windows system. OpenBSD was cracked, and they put as much effort into things and Multics did.


Henry Miller: The author of the vstr library did a study of real-world problems on RedHat Linux. He concluded that only about 40% of all serious problems could be prevented with a better language. This would be a great improvement, but it isn't near as large as most people think.

I'm confused: I can find neither the 40% nor the "better language" claim by following the provided link. In any case, one of the major arguments that we proponents of richer type systems make is precisely that there's more to safety than memory safety.

Henry Miller: I suspect that if Multics was used as an Internet server today it would have been cracked a few times. The parts of the puzzle are just too many and too complex. Though it would have much less successful attacks that the typical Unix or Windows system. OpenBSD was cracked, and they put as much effort into things and Multics did.

There's also more to security than perimeter security; see the E programming language and related links for the most in-depth coverage of the issues extant, as far as I'm aware. Also, saying "OpenBSD was cracked" is more than slightly disingenuous, given that it's had "Only one remote hole in the default install, in more than 8 years!"

the claims

The 40% claim comes from other posts (elsewhere) by the author. Reading the page you can see that it is close to that, though clearly only 1 digit is significant.

I don't want to get into the pros/cons of type safety. Smarter people than me have argued about it for years. There are some advantages, but there are also disadvantages.

"Only one remote hole in the default install, in more than 8 years!" was the exact point I was trying to make. Smart people who focus on security first, and carefully program (though I don't know if they use formal methods) still missed something.

E is very interesting, but the model can be broken, particularly if you do stupid things such as grant excess capabilities to something. There are many stupid programers out there. Worse there are many lazy programers. If E was common, then you can be sure that it would be common to see code that just gave some program all rights when it only needed a small amount because it is easier (you can write a script to give your program rights to everything, restricitng rights requires hand work)

Perfection is Elusive

Regarding OpenBSD, I think you might have an unrealistic standard if one remote hole in over eight years doesn't sound like an impressive accomplishment, especially for a very traditional UNIX system!

Regarding E, I think you're largely mistaken: E doesn't lend itself to writing code that takes a "just make everything available" stance, to put it mildly. In particular, "You can write a script to give your program rights to everything, restricting rights requires hand work" is exactly backwards. It takes a conscious effort to expose capabilities in E, since "only connectivity begets connectivity." The one thing you can't do (not just E, but anyone) is prevent subjects who should be allowed to communicate in the context of a system from delegating capabilities, but capabilities win because you shouldn't be able to express a security restriction that you can't enforce.


You misunderstand. I am very impressed with openBSD. However despite all their efforts they still failed once. Only once (so far), but that is still one time when if you were running openBSD you were vulnerable because of their mistake

As for E, they make it hard give excessive rights. However I know some programmers who will get tired to figuring out what rights everything needs, and just give all rights to everything because it doesn't matter. The E documentation lists some ways you can create a security hole with E. They are things that only an idiot would do, but there are plenty of idiots out there who program.

Better languages

I don't understand what you're getting at when it comes to E, or in general, languages which enforce desirable properties. Any language can be subverted to some extent by a programmer who doesn't know any better, or one who is determined to do so. Perfect idiot-proofing is not a practical goal for Turing-complete languages.

A more important practical goal for developing better languages is to enable competent professionals, using proper processes, to do their job better, and make fewer mistakes.

Coming back to a point in an earlier comment:

He concluded that only about 40% of all serious problems could be prevented with a better language.

It's worth emphasizing that this improvement is purely based on focusing on one feature, string safety. A "better language" can provide all sorts of other advantages. The list of vulnerabilities that vstr might have avoided includes no analysis of which other language features might have addressed other vulnerabilities — use of E would certainly have addressed quite a few of them, for example, and use of garbage collection would have addressed others. The total percentage that would have been addressed by many "better languages" is clearly much higher than 40%, just based on a casual inspection of that list.

As for OpenBSD, that's a testament to what can be achieved with even very unsafe languages, using good people and good processes. Imagine what those same people might have achieved if they hadn't had to spend so much time doing jobs that their languages, used properly, could do for them.

Well you could for example ma

Well you could for example make a VM that asks the user for permission before doing anything dangerous. Then a language targeting that VM coudln't possibly make an unsafe program. (basicaly the same idea as a capabillitys secure operativ system.)

User-friendly representation of capabilities

Ignoring the performance and annoyance issues, how would you represent an arbitrary "anything dangerous"?

It's easy to represent very coarse capabilities, like reading the specific file, but I estimate number of capabilities in programs that really use them to be so huge, and differences between them being so intricate, that no user could tell the difference between them, even if we invent some wonderful renderer for capabilities.

A reasonble question is - if a user cannot cope with this load, how a developer is supposed to? One of the answers would be compositionality. The developer deals with capabilities by baby steps. E.g., in OO setting, the developer can think of some object x as "the A I got from y" and about object y as "the B I got from z", etc., not as "the object with OID 1234567890". Hope my clumsy explanation is somewhat understandable :-)

Granma's Rules of POLA

Granma's Rules of POLA are supposed to provide some useful rules of thumb for the everyday user of a capability based system.

Edsger Wybe Dijkstra! (not Dijstra or Dijsta)

He was pretty cool.

Here you can watch some interviews (English with Dutch subtitles):


Program requirements have lots of forms.

The reason there are bugs is that program requirements have lots of forms and they can extend over 1, ten or thousands of code lines. Modern programming languages are all poor at testing requirements as wholes.

Modern type systems concern values, but what about the interaction between and application of values? type systems do not handle those things.

For example, there is no such type system that can statically check usage of B on A after A has been initialized. It has been discussed here in LtU with an example about files: there is no type system, for example, that enforces usage of a file after it has been opened.

There are. The problem is whe

There are. The problem is whether a file is known-open (or in the example, known-closed) in a given branch or not, although typing can also be used to force the presence of run-time checks ("you can read if you've checked it's open").

The problem would be solved by user-defined annotations.

example (I remember I have posted this before, but anyway):

File create() -> closed;
void open(File file -> closed) -> open;
void close(File file -> open) -> closed;

Then a compiler could easily check the flow of the code to see if the user-defined states are honoured by the code.

Program flow should be strongly typed.

A mechanism already exists

You're trying to fake dependant types again. Open and closed are obvious subtypes (or parameterisations - File Open vs File Closed vs File any) of File. Then you just do the linear types thing, much as you're requiring.

Indeed, but... is a nicer (methinks) way to declare those types, as well as being open to extend the types without much strange grammar. It is something the average Joe programmer will consume much more easily.

Not as-is, you've not got a f

Not as-is, you've not got a fully-specced mechanism. I do like the idea of sugaring "run this linear value through a function and rebind a return value of some related type to the same identifier", but parameterising on a record or similar structure seems to be a far more appropriate way to allow the type to be extended with new properties. Otherwise you're left defining things like how individual or states relate to each other. Also, your syntax doesn't seem to handle things like multi-parm functions at all?


It was simple but pretty cool. Made by one of the people who worked on Xok. Here's a link. You'll have to figure out which papers I'm referring to (and check references). That said there is tons of cool stuff there.

Much more than just type systems

Most of the discussion is about type systems, as if they're the key to the problem. Remember, Dijkstra was all about building correct programs from specifications, which is a much bigger problem than simply having a well-typed program (though no one would argue that a solid type system isn't one part of a provably correct program.)

Thinking back to the problems I worked through from Dijkstra's books, they were all relatively simple type-wise, but that doesn't mean that deriving the programs was easy by any means!

What if requirements were types?

I have worked extensively with requirements documents provided by external companies, and I can tell you that requirements are actually types. Requirements express invariants (static/dynamic), information structure and process flow; these concepts can already be expressed with types.

If those requirements could be expressed in a formal way, then an implementation could be checked against them for correctness.

I suppose you could look at it that way

But I'm not sure how much that buys you. It still feels like this is being force-fit into a type system issue, when I don't think that it is at all.

Creative Tension

It's true that what you can formally define (nevermind what you can intuit!) is still broader in scope than what's covered by any extant type system. This is precisely what drives type system research: we'd like to design type systems that type more programs than is currently possible, do so with a real economy of annotation, and when annotation is necessary, not have it be completely alien. So far, the progress on the first goal seems to be excellent, the progress on the second goal pretty darned good, and progress on the last goal absolutely abysmal. IMHO, of course.