Lambda the Ultimate

Udell: Test before you leap into development
started 8/4/2003; 9:25:56 AM - last post 8/11/2003; 11:32:43 AM
Ehud Lamm - Udell: Test before you leap into development
8/4/2003; 9:25:56 AM (reads: 2731, responses: 37)
Udell: Test before you leap into development
A short exposition of TDD with quotes from Ward Cunningham and Brian Marick.

Interestingly enough, Jon doesn't stress two things I find to be of fundamental importance: TDD is more appropriate to some programming languages and architectures than to others, and IDE support is quite helpful for xUnit to take hold (he does mention the "green bar", though).

Jon posted a few more bits from his interviews with Ward and Marick to his weblog.


Posted to Software-Eng by Ehud Lamm on 8/4/03; 9:27:37 AM

Marc Hamann - Re: Udell: Test before you leap into development
8/4/2003; 10:49:31 AM (reads: 1794, responses: 0)
TDD is more appropriate to some programming languages and architectures than to others

Hmmm. If you said TDD with xUnit I might agree, but I'm not sure that you couldn't be productive with the basic concept of TDD in any language or architecture. What specific examples did you have in mind?

IDE support is quite helpful for xUnit to take hold

While I use an IDE that natively supports JUnit and refactorings, many TDD aficionados prefer the command-line version of xUnit and do all their refactorings the "hard way". Obviously having automated help is always a bonus to the popularity of a discipline. ;-)

Tayssir John Gabbour - Re: Udell: Test before you leap into development
8/4/2003; 1:24:49 PM (reads: 1764, responses: 1)
TDD is less appropriate to Design by Contract languages or anything else that does work at compile-time?

I can imagine a great PR campaign for TDD. "You're safe when the bar is green." Very Pavlovian.

Marc Hamann - Re: Udell: Test before you leap into development
8/4/2003; 1:36:41 PM (reads: 1802, responses: 0)
TDD is less appropriate to Design by Contract languages or anything else that does work at compile-time?

I'm not sure why this has to be so. Java does compile-time work (ensuring referenced classes exist, type checking, etc.). You just set that stuff up as part of the "red bar" phase, i.e. get it to compile so it can fail.
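For instance, a first "red bar" in JUnit might look something like this (a minimal sketch; the Account class and its stub are made up for illustration):

    import junit.framework.TestCase;

    // Just enough stub to compile, so the test can run and fail:
    // that first failure is the "red bar" that starts the cycle.
    class Account {
        void deposit(int amount) { }       // deliberately does nothing yet
        int getBalance() { return 0; }
    }

    public class AccountTest extends TestCase {
        public void testDepositIncreasesBalance() {
            Account a = new Account();
            a.deposit(100);
            assertEquals(100, a.getBalance()); // red until deposit is written
        }
    }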

Patrick Logan - Re: Udell: Test before you leap into development
8/5/2003; 10:43:32 AM (reads: 1649, responses: 0)
TDD is less appropriate to Design by Contract languages or anything else that does work at compile-time?

I have found TDD and DBC to be complementary. TDD is an approach for creating a design in small increments. Each increment may add something to the contract.

Do you see a conflict?

Tayssir John Gabbour - Re: Udell: Test before you leap into development
8/5/2003; 8:20:53 PM (reads: 1597, responses: 2)
I have found TDD and DBC to be complementary. [...] Do you see a conflict?

No, I'm just reading Ehud's words to mean there's a continuum in how vital TDD is to writing large programs in a given language. Pythonistas seem to rally around TDD as a response to static-typing critics. So on that argument alone, Python sits higher on that continuum than it would with more static checking, because Python has certain gaps that TDD fills.

Of course, I'm just guessing at his meaning. I've always needed some form of unit testing when writing pure Java apps, because the language is so verbose and bondage-and-discipline that it's easy to miss the forest for the trees.

The more I think about it, the more I think his statement was mainly meant to be provocative. TDD's advantages are also in documentation, morale and other things which aren't necessarily language issues.

Ehud Lamm - Re: Udell: Test before you leap into development
8/5/2003; 10:22:46 PM (reads: 1616, responses: 1)
To clarify: I never said TDD and DBC don't go together. In fact I didn't mention DBC at all.

I only said that the applicability of TDD depends on the programming language. As an extreme example, think about doing TDD in assembly. It is too expensive (you need to write too much code), too unreliable (it is quite likely you'll get the unit tests wrong), and dangerous (the unit test code changes offsets in the code, affects addressability, etc.).

Even with languages that are conducive to TDD, some are better at it than others.

Frank Atanassow - Re: Udell: Test before you leap into development
8/6/2003; 1:47:08 AM (reads: 1576, responses: 1)
TDD is less appropriate to Design by Contract languages or anything else that does work at compile-time?

DBC does not do work at compile-time.

Vesa Karvonen - Re: Udell: Test before you leap into development
8/6/2003; 4:38:29 AM (reads: 1625, responses: 0)
As an extreme example, think about doing TDD in assembly. It is too expensive (you need to write too much code), too unreliable (it is quite likely you'll get the unit tests wrong), and dangerous (the unit test code changes offsets in the code, affects addressability, etc.).

I'm not sure that I buy this argument. Let's look at the points:

It is too expensive (you need to write too much code)

I think it is most likely much more expensive not to unit test the assembly code. Assembly programming is highly error prone. If you don't test, you are likely to have many, many bugs that would have been caught if you did TDD. If I were writing something in assembly language, the last thing I would do is avoid testing because it is costly. This point simply does not make sense to me.

Also, if you need to write many tests, you can write some routines (and macros) to reduce the work needed to write tests in assembly language just as you can do in other languages.

too unreliable (it is quite likely you'll get the unit tests wrong)

Let's assume that you get a test that is wrong.

Assuming you end up with a double negative (the tested code is wrong, the test is wrong, but the errors cancel each other out), you end up with incorrect code. Is this assembly language specific? No, it isn't. You can easily write both incorrect code and an incorrect test that cancel each other out in any language. Hopefully you'll eventually run into the bug before you ship. The more tests you write, the less likely it is that the bug goes unnoticed.

Otherwise the test incorrectly says that the code is wrong, so you will use a debugger to debug the code and you will eventually find out that the test is wrong, fix the test, and continue. Is this assembly language specific? No, it isn't. Actually, one of the first things I do when I get a failed test is to look at what the routine actually produces. If the output seems correct, I first assume that the test is wrong.

So, a bug in a test costs you some time. This is the same in every language.

dangerous (the unit test code changes offsets in the code, affects addressability, etc.)

I don't think that this is a real problem. One of the main purposes of an assembler is to automatically compute offsets and make sure that addressing works. So, if the test code changes an offset, the assembler will still compute the offsets correctly and will give you an error if, for example, the offset gets too large, in which case you can move and modify the test code so that it doesn't cause the problem anymore.

Of course, why would anyone implement the test code in such a way that it might cause such problems with offsets in the first place? Normally you would be placing the test code and the code to be tested into separate source files and assemble them separately so that you can easily drop the test code when it isn't needed. The assembler and linker will ensure that the offsets and addressing will work.

...

The above assumes that the tests are written in assembly. It is quite often the case that you can (and should) write the tests in some higher level (compared to assembly) programming language. This is exactly what I have done numerous times. Some years ago, I wrote an optimized memory copy routine for the SH4 on the Dreamcast console. The test code, which basically did input/output equivalence class coverage testing of the assembly code, was written in C++.
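To give a flavour of what I mean, here is the shape of such a test, transposed into Java for readability (copyBytes is a stand-in for the optimized routine; the real harness was C++ driving the SH4 assembly):

    import java.util.Arrays;
    import junit.framework.TestCase;

    public class CopyRoutineTest extends TestCase {
        // Stand-in for the optimized copy routine under test.
        static void copyBytes(byte[] src, byte[] dst, int n) {
            System.arraycopy(src, 0, dst, 0, n);
        }

        // One representative length per input equivalence class: empty,
        // below any unrolling threshold, at a power of two, and large.
        public void testEquivalenceClasses() {
            int[] lengths = { 0, 1, 7, 8, 64, 4099 };
            for (int k = 0; k < lengths.length; k++) {
                int n = lengths[k];
                byte[] src = new byte[n];
                for (int i = 0; i < n; i++) src[i] = (byte) i;
                byte[] dst = new byte[n];
                copyBytes(src, dst, n);
                assertTrue("length " + n, Arrays.equals(src, dst));
            }
        }
    }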

...

I'd say that the lower the abstraction level of the programming language that one is using, the more tests should be written. Not the other way around.

Daniel Yokomiso - Re: Udell: Test before you leap into development
8/6/2003; 6:18:03 AM (reads: 1555, responses: 0)
DBC does not do work at compile-time.

While there are some contracts that can't be enforced at compile-time, nothing stops a compiler from checking contracts. Spark does something along these lines, AFAICT.

Disclaimer: I only read papers about it, never used the tool.

Patrick Logan - Re: Udell: Test before you leap into development
8/6/2003; 2:53:54 PM (reads: 1476, responses: 3)
As an extreme example, think about doing TDD in assembly. It is too expensive (you need to write too much code), too unreliable (it is quite likely you'll get the unit tests wrong), and dangerous (the unit test code changes offsets in the code, affects addressability, etc.).

It's been a *long* time since I did any assembly language programming. But back then I used macros to take care of offsets and addresses. Are there really assemblers these days that do not have a good set of these features?

If so, and I had to write a lot of assembly, the first thing I would do is write a little language in Scheme to spit out this target language. Then, since the target system is probably fairly constrained in other ways (to justify such poor programming tools), I would look into writing an emulator (in Scheme) to support better introspection during development.

If you don't have an agile world, the first thing to do is consider how you can create an agile world around it.

Stuart Allie - Re: Udell: Test before you leap into development
8/6/2003; 8:23:00 PM (reads: 1473, responses: 1)
I don't want to upset people, but I find it surprising that people at LtU, who generally speaking are concerned with program correctness, find TDD attractive. The idea of "make it build, make it run, make it right" makes me shudder. I can see many people omitting the last step altogether. Once it passes the unit tests, the program *must* be correct, let's ship it! On the other hand, if you make it right, it *will* build and run...

The only thing a unit test can tell you is that your code is broken. It can *NOT* tell you that it is *correct*. You could write a program that passed every unit test merely by having a big lookup table of the expected test results - but it wouldn't be a very useful program.
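To make that concrete, here is such a "program" in Java (the three entries are hypothetical; the point is only that it passes any suite that probes exactly these inputs):

    import java.util.HashMap;
    import java.util.Map;

    // Passes every unit test that asks about 2, 3 or 10,
    // while computing nothing at all. Passing tests != correct.
    public class LookupSquare {
        private static final Map table = new HashMap();
        static {
            table.put(new Integer(2), new Integer(4));
            table.put(new Integer(3), new Integer(9));
            table.put(new Integer(10), new Integer(100));
        }
        public static int square(int x) {
            return ((Integer) table.get(new Integer(x))).intValue();
        }
    }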

The only time that unit tests tell you *anything* about program correctness is when you have guaranteed 100% coverage with your tests. But wouldn't constructing such a collection of tests constitute a proof of correctness anyway?

I support unit testing as a pragmatic means of finding *some* bugs while developing code, but it is no substitute for reasoning about programs in a rigorous way.

I know advocates like Jon Udell go out of their way to say that TDD is *not* a silver bullet, but I am sure it will be glommed onto by people out to make a buck as the "next big thing" and, like OOP, will hold back the development of truly robust software for a few more years.

Sigh.

Ehud Lamm - Re: Udell: Test before you leap into development
8/7/2003; 12:09:33 AM (reads: 1482, responses: 2)
I guess I didn't make myself clear, so let me try to be more concrete.

I used ASM/370 quite a lot, so I'll talk about this specific assembly language/machine code.

Addressing uses a base register/offset pair, where the offset can be 0-4095 (#FFF). Adding instructions in the middle of the code can cause addressability errors, if the code block grows longer than 4095 bytes, causing a need for an extra base register. This is a common programming scenario, but one that causes a lot of headaches, so you want to avoid it as much as possible.

One solution is to jump to subroutines, assuming you have enough room (in your 4095 bytes) for the extra jumps (not always a safe assumption, given how tight assembly code usually is). However, in order to do this jump you use at least two registers (the base address of the subroutine, and the return address). Alas, in many cases you have important data in all 15 registers. In order to call the subroutine you need to store them to memory, and restore them. This requires instructions, which may themselves break addressability. But even if you can afford to add these instructions, you need memory for the registers (which must be allocated dynamically, if your code is reentrant), and enough extra code that it's not worth the hassle.

Notice that I am not implying that testing isn't important or useful. Only that the techniques used may be quite different (e.g., taking memory snapshots/dumps, using debuggers instead of producing output, etc.).

Frank Atanassow - Re: Udell: Test before you leap into development
8/7/2003; 3:52:35 AM (reads: 1432, responses: 1)
While there are some contracts that can't be enforced at compile-time, nothing stops a compiler from checking contracts. Spark does something along these lines, AFAICT.

"Some"? More like almost all.

Nothing stops my Haskell compiler from rejecting unproductive non-terminating programs, optimally strictifying, or producing optimal machine code either, yet it doesn't happen in practice.

These sorts of problems are all undecidable, and equivalent to general theorem-proving. You cannot rely on them, and are better off IMO if you don't.

Marc Hamann - Re: Udell: Test before you leap into development
8/7/2003; 5:37:25 AM (reads: 1462, responses: 0)
The only thing a unit test can tell you is that your code is broken. It can *NOT* tell you that it is *correct*.

I think you have some major misconceptions about TDD. It is primarily a design technique: it allows/encourages you to design an application incrementally, based on smaller sets of functionality at a time. The tests have several functions, which include:

- to allow you to be explicit about your design and expectations
- to encourage you to write testable code (generally more modular)
- to give you the confidence to make major design improvements late in the game (since you can tell immediately if you have broken something with your change)

This just scratches the surface.

As for "correct", do you mean simply "guaranteed to terminate" or "guaranteed to do what it is supposed to"? If the latter (which is the main concern with software reliability), what techniques do you propose to use?

Though I can't think what practical things these would be, I would imagine most could be done in addition to TDD.

Vesa Karvonen - Re: Udell: Test before you leap into development
8/7/2003; 5:59:30 AM (reads: 1521, responses: 1)
I guess I didn't make myself clear, so let me try to be more concrete.

No, based on your second post I understood your first post perfectly well. It is just that I do not agree with you. I think that you are wrong. I will also try to be more concrete.

I used ASM/370 quite a lot, so I'll talk about this specific assembly language/machine code.

I have no experience with ASM/370, but I have many years of assembly language programming experience on the Motorola MC680x0 series and the Intel 80x86 series. I also have some assembly language programming experience on other CPU architectures (e.g. the Hitachi SH4 I mentioned earlier).

Addressing uses a base register/offset pair, where the offset can be 0-4095 (#FFF). Adding instructions in the middle of the code can cause addressability errors, if the code block grows longer than 4095 bytes, [...]

Like I said, I have no ASM/370 experience, but I have never programmed on a CPU that didn't have some limitations like this. On every CPU I have programmed on, there have been simple ways of avoiding such problems. As a concrete example, with a simple section/segment declaration you can usually make sure that the test code is stored in a different section/segment (area of memory) and thus will not interfere with the layout of the code that is being tested. As another concrete example, one must realize that the test code can usually use longer [sequences of] instructions to call the procedures to be tested, because the size of the test code is not that important (you can drop the test code during linking whenever you want.)

Alas, in many cases you have important data in all 15 registers. In order to call the subroutine you need to store them to memory, and restore them. [...] you need memory for the registers (which must be allocated dynamically, if your code is reentrant), and enough extra code that it's not worth the hassle.

Register pressure is nothing new to assembly language programmers.

Does the ASM/370 have a stack (or does the OS have conventions to use a particular register as the stack register)? That is where stuff is usually stored across procedure calls. It is typically easy to push registers to the stack and later pop the data back into registers. (Another concrete alternative is to set up a stack yourself, simply because it can make things so much easier.)

Novice assembly language programmers may not realize it, but a common programming idiom in assembly language is to use consistent calling conventions for procedures. The small call overhead imposed by the use of calling conventions is rarely significant. In cases where it is significant it is usually better to use, for example, macros to inline the code and avoid the use of procedures. Most good assemblers (macro assemblers) have macros of some sort.

Notice that I am not implying that testing isn't important or useful. Only that the techniques used may be quite different (e.g., taking memory snapshots/dumps, using debuggers instead of producing output, etc.).

I would not call debugging a testing technique. I think that testing and debugging are completely different kinds of activities with completely different kinds of long-term effects. Also, if you take the idea of taking memory snapshots to its logical conclusion, you'll end up with some sort of automated regression testing.

...

Honestly, nothing you have said has convinced me that it would be intrinsically so difficult to write test code in assembly language, due to language/machine limitations, that it would make sense to avoid it at all costs. I do agree that, as in any language, the programmer may not have the talent to organize his programming well enough to make the testing simple. DFT (Design For Testability) is just something that most programmers are not born with.

Also, I realize that it may (or may not - I don't know) be the case that the assembly language programming you have done was a long time ago and you have learned a great deal more (almost everything) about programming (about controlling complexity) after that. If that is the case I can certainly understand why you feel that assembly language programming is so difficult that you wouldn't even want to write tests.

Ehud Lamm - Re: Udell: Test before you leap into development
8/7/2003; 6:15:20 AM (reads: 1532, responses: 0)
It was only a couple of years ago. I knew quite a bit then, and I know quite a bit now.

By the way, I am a pretty good assembler programmer...

Chris Rathman - Re: Udell: Test before you leap into development
8/7/2003; 7:44:02 AM (reads: 1417, responses: 0)
Not that this disagrees with the principles of TDD...

When I did real-time systems with certain parts of the code in assembly for responsiveness, I found that the tests themselves could change the timing of the interrupt subroutines and the coprocessor synchronization.

My take would be that assembly language is often used in environments where timing is the critical factor. Built-in tests have to be careful not to alter that timing, or they end up skewing the thing being tested. It's for this reason that logic analyzers sometimes make better testing platforms for software that is close to the metal.

Daniel Yokomiso - Re: Udell: Test before you leap into development
8/7/2003; 12:50:21 PM (reads: 1430, responses: 0)
"Some"? More like almost all.

Nothing stops my Haskell compiler from rejecting unproductive non-terminating programs, optimally strictifying, or producing optimal machine code either, yet it doesn't happen in practice.

If your language only accepts contracts that are verifiable by its type system, then all contracts could be enforced at compile-time. Perhaps my wording was inexact; what I was trying to say is: "While languages that support DbC, like Eiffel, allow the writing of some contracts that can't be enforced at compile-time (because they allow undecidable and/or state-changing predicates), a compiler could check the contracts that are verifiable and report 'Contract Error!' at compile-time. Spark does something along these lines, AFAICT."

These sorts of problems are all undecidable, and equivalent to general theorem-proving. You cannot rely on them, and are better off IMO if you don't.
I don't see how this remark fits in our discussion:

Tayssir John Gabbour said:
TDD is less appropriate to Design by Contract languages or anything else that does work at compile-time?
You answered:
DBC does not do work at compile-time.
I replied:
While there are some contracts that can't be enforced at compile-time, nothing stops a compiler from checking contracts. Spark does something along these lines, AFAICT.

So now I don't know what your point is anymore. Is DbC runtime-only, compile-time-only, or what? I view contracts as an extension to the type system, with some runtime support but with most of its predicates being verifiable by the compiler. Your answer indicated that DbC is runtime-only, because, IMO, "DBC does not do work at compile-time" implies that no part of DbC works at compile-time, so I tried to show evidence of a language where some contracts are verified at compile-time. Did we change the subject?

Stuart Allie - Re: Udell: Test before you leap into development
8/7/2003; 4:27:42 PM (reads: 1390, responses: 1)
I wrote: The only thing a unit test can tell you is that your code is broken. It can *NOT* tell you that it is *correct*.

Marc responded: I think you have some major misconceptions about TDD. It is primarily a design technique: it allows/encourages you to design an application incrementally, based on smaller sets of functionality at a time. The tests have several functions, which include:

- to allow you to be explicit about your design and expectations
- to encourage you to write testable code (generally more modular)
- to give you the confidence to make major design improvements late in the game (since you can tell immediately if you have broken something with your change)

This just scratches the surface.

As for "correct", do you mean simply "guaranteed to terminate" or "guaranteed to do what it is supposed to"? If the latter (which is the main concern with software reliability), what techniques do you propose to use?

Though I can't think what practical things these would be, I would imagine most could be done in addition to TDD.

For "correct" read "proven to meet specifications" as well as "guaranteed to terminate". Anything less is incomplete or broken.

I don't know how to always build correct software. I can do it sometimes by being very careful, thinking long and hard, and doing formal proofs where needed. I don't know where the future for provably correct programming lies, but I think we are many years away from reliable computing.

Perhaps I do misunderstand TDD, but I really don't see the benefits as a design tool. If your small sets of functionality do not have 100% test coverage then the tests are of little value anyway. It would be better to prove the small sets correct and then you know you can rely on them. How does knowing that your code passes an incomplete test help you design the overall system? It sounds like a false or at least fragile sense of security to me.

To me, TDD seems like a lot of work for little benefit. Knowing that a program passes all the tests doesn't tell me very much. As I said, you can write a program that passes all the tests but does nothing else. And writing tests that provide 100% coverage (equivalent to a guarantee of correctness) is never going to happen.

I really just don't see the fuss about something that can only provide a very small benefit.

I'd be happy to be wrong, and I *will* go away and do some more reading on TDD to see if I am wrong.

Patrick Logan - Re: Udell: Test before you leap into development
8/7/2003; 6:16:47 PM (reads: 1377, responses: 0)
Perhaps I do misunderstand TDD, but I really don't see the benefits as a design tool.

In Lisp, the read-eval-print loop is a design tool for small incremental improvements to a program. Over time, the transcript accumulates a history that can be edited into regression test cases for the code you developed.

In TDD, the test suite is a similar accumulation. Running the test suite over and over with small incremental improvements is like running the Lisp interpreter. xUnit is a command line interpreter with some additional disciplinary support that you would apply yourself if you were using a command line interpreter.
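Concretely, the kind of check you would type once at a Lisp prompt becomes, with xUnit, a test you keep and rerun (a minimal JUnit sketch, using java.util.Stack as the example):

    import junit.framework.TestCase;

    // The throwaway prompt interaction, kept as a regression test.
    public class StackTest extends TestCase {
        public void testPopReturnsLastPush() {
            java.util.Stack s = new java.util.Stack();
            s.push(new Integer(1));
            s.push(new Integer(2));
            assertEquals(new Integer(2), s.pop());
            assertEquals(new Integer(1), s.pop());
            assertTrue(s.isEmpty());
        }
    }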

Patrick Logan - Re: Udell: Test before you leap into development
8/7/2003; 6:23:46 PM (reads: 1376, responses: 0)
If your small sets of functionality do not have 100% test coverage then the tests are of little value anyway.

Do you mean 100% code coverage?

Considering that you can:

* Add small tests incrementally.
* Add only enough code to keep the tests running.
* Add only tests that correspond to a feature your customer wants.
Then you might conclude:
* You have the right tests for your customer.
* You have just enough code but not too much.
* You have roughly 100% coverage of what your customer wants.

Patrick Logan - Re: Udell: Test before you leap into development
8/7/2003; 6:28:12 PM (reads: 1361, responses: 0)
I *will* go away and do some more reading on TDD to see if I am wrong.

Better yet, go away and do some testing. See for yourself. Find someone who lives near you, and do some pair testing.

This will be much more instructive than just reading about someone else's experience. Maybe it isn't for you, maybe it is.

Try to find out.

Stuart Allie - Re: Udell: Test before you leap into development
8/7/2003; 8:32:59 PM (reads: 1368, responses: 0)
Patrick wrote: Do you mean 100% code coverage?

I mean that the tests cover 100% of the possible cases - i.e., they constitute a proof that the code *works*, not just that it gives the right answer for a *particular* input. As I said, a lookup table can pass the tests without doing anything useful.

Considering that you can:

* Add small tests incrementally.
* Add only enough code to keep the tests running.
* Add only tests that correspond to a feature your customer wants.

And here's the problem again. Just because you have added code that passes the tests *doesn't* mean that you have correctly implemented a given feature. Unless the test covers all possible cases. Nowhere have I seen any indication that unit test writers have it drummed into them that they *must* cover *all* possibilities for their tests to be useful.

Then you might conclude:
* You have the right tests for your customer.
* You have just enough code but not too much.
* You have roughly 100% coverage of what your customer wants.

But you *only* have code that *passes the tests*. Not code that implements the spec.

IF (and that's a very big "if") unit tests are being written to cover *all* cases, then I agree that unit testing is very useful. I've seen no indication that that is the case. In fact, quite the opposite. The unit test examples I have seen provide very little information about program correctness, except for particular input values. And that's, to my mind, *worse* than useless, because it provides a false sense of security.

It is useful to know when your program is broken, and unit tests can tell you that. But the *absence* of a failed test tells you *ABSOLUTELY NOTHING* about the program. Passing all unit tests provides no information other than "this program passes these tests." It does *not* imply "this program does what it is specified to do." And this point is not made in the testing advocacy I have seen. It's like looking through a pinhole at the world and saying "well, I can't see anything dangerous out there, so it must be safe". It's a basic logical error.

Patrick Logan - Re: Udell: Test before you leap into development
8/7/2003; 9:25:44 PM (reads: 1373, responses: 0)
Just because you have added code that passes the tests *doesn't* mean that you have correctly implemented a given feature. Unless the test covers all possible cases.

In TDD the phrase "all possible tests" means "all tests the customer cares about". There are a few levels of testing:

1.a. Enough unit tests to design the software.
1.b. Enough unit tests to complete the software.
2.a. Enough acceptance tests to satisfy the customer.
2.b. Enough acceptance tests to complete the software.

The "complete the software" tasks are where you have to bring your experience as a developer/tester/QA person to determine you have enough tests.

Passing all unit tests provides no information other than "this program passes these tests." It does *not* imply "this program does what it is specified to do."

The specification is something that evolves in an on-going exchange between the development team and the customer. This can be as formal as the team and the customer desires for any given situation.

Curt Sampson - Re: Udell: Test before you leap into development
8/7/2003; 11:15:42 PM (reads: 1361, responses: 1)
Stuart, I think you are unhappy with the concept of TDD because you're aiming at a different goal than many other software developers. You want to produce a provably correct program, which is a noble goal. But where I work, we have a limited budget, so we have to achieve a balance between the amount we spend on testing and the amount we spend on everything else. So for each test, we ask, "how likely is it to catch a problem with our system, relative to the cost of writing the test?" We write enough tests to give us reasonable confidence that the system is working, and leave it at that. The reason for this is that our customer would rather have two features, each with a 99% chance that they are working correctly, than one feature with a 100% chance it's working correctly.

And in my experience, having broad unit test coverage, even if it's far from 100% coverage, makes a huge, huge difference in how you handle the code. I freely do massive rototills that four years ago, before I'd started doing unit testing, I would never have even contemplated.

The best way to learn what TDD is like is probably not to read about it, but to pair up with a programmer experienced with it. I suspect it's really hard to get a sense from books of what it feels like to do TDD.

cjs

Vesa Karvonen - Re: Udell: Test before you leap into development
8/8/2003; 3:48:12 AM (reads: 1380, responses: 0)
I don't know how to always build correct software.

Fair enough. Neither do I.

I can do it sometimes by being very careful, thinking long and hard, and doing formal proofs where needed.

Are your proofs automated? What I mean is that if you happen to need to change the behaviour of some part of the program, then do you need to do those proofs again?

The nice thing about many forms of testing is that they are easy to automate with our current knowledge of logic. I don't know how or what you program, but much of my programming is, and has been, about changing existing programs (code that I wrote earlier, or code that someone else wrote earlier) or iterative and incremental design of new programs.

Contrary to what one may believe, well-written tests can and do reveal many kinds of bugs. I have certainly benefited from my tests many times, as I have changed a program in some way and a test has proven that the change broke something.

...

I think it is a myth that programmers believe that finite testing, in general, proves correctness. I certainly don't know any programmer (not a single one) who thinks that way. Many of those programmers still use testing, because testing helps them catch many kinds of errors more economically than formal proofs of correctness.

Vesa Karvonen - Re: Udell: Test before you leap into development
8/8/2003; 3:53:38 AM (reads: 1375, responses: 0)
The reason for this is that our customer would rather have two features, each with a 99% chance that they are working correctly, than one feature with a 100% chance it's working correctly.

The way you talk above sounds like you have actually asked your customer. Well, honestly, have you?

Adewale Oshineye - Re: Udell: Test before you leap into development
8/10/2003; 1:23:56 AM (reads: 1237, responses: 1)
Stuart's point that a sufficiently large look-up table would be capable of passing unit tests is a good one. It's something that's bothered me on occasion. However, we have integration and acceptance/functional tests that the system must pass, and a lookup table won't be enough to satisfy these.

100% test coverage is a lot easier to achieve when using TDD, but it's not a panacea. Brian Marick has a pair of articles (http://www.testing.com/writings/coverage.pdf and http://www.testing.com/writings/classic/mistakes.html) pointing out that even with 100% coverage you haven't proved your code is correct, only that every line can be executed with certain sets of data without causing a system failure.
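A tiny Java illustration of the distinction: the two asserts below execute every line and branch of abs, i.e. 100% coverage, yet they never touch the overflow bug at Integer.MIN_VALUE.

    public class CoverageDemo {
        static int abs(int x) {
            return x < 0 ? -x : x;  // -Integer.MIN_VALUE overflows back to itself
        }
        public static void main(String[] args) {  // run with: java -ea CoverageDemo
            assert abs(5) == 5;     // covers the x >= 0 branch
            assert abs(-5) == 5;    // covers the x < 0 branch
            // 100% coverage, but abs(Integer.MIN_VALUE) is still negative.
        }
    }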

Ehud Lamm - Re: Udell: Test before you leap into development
8/10/2003; 1:30:15 AM (reads: 1277, responses: 0)
I think it is much better to think of TDD as a specification technique, not a testing method.

Unit tests are a form of informal (and incomplete) yet executable specifications. Since most projects use informal and non-executable specifications, if they use any kind of specification technique at all, TDD should be seen as a step in the right direction. The same goes for DbC, which indeed relies on run-time checks instead of formal proofs. It's still better than the techniques most projects are using daily.

Frank Atanassow - Re: Udell: Test before you leap into development
8/10/2003; 8:23:35 AM (reads: 1234, responses: 0)
Daniel wrote: a compiler could check the contracts that are verifiable and report 'Contract Error!' at compile-time

In principle, for a tiny class of procedures designed by contract. The term "design by contract" implies that the contract provides a specification of the procedure. And automated correctness testing is undecidable in general.

I wrote: These sorts of problems are all undecidable, and equivalent to general theorem-proving. You cannot rely on them, and are better off IMO if you don't. Daniel wrote: I don't see how this remark fits in our discussion:

OK, my point was simply that DbC is overwhelmingly a dynamic technique.

I compared it with optimization because program optimization is an example of a static analysis which is undecidable. Almost all compilers do it, of course, with mixed results. I brought it up because many researchers believed that functional languages would eventually outperform conventional languages because they are 'amenable to static analysis'. And yet that dream has never been realized for a serious language.

But an even better comparison is with static type systems. They are logics specifically designed to be statically verifiable, given a proof (the program). And yet DbC logics are far more powerful, and do not provide proofs/certificates.

In comparison to that, the idea that DbC (as it stands) can be an effective static technique is hopelessly naive.

Patrick Logan - Re: Udell: Test before you leap into development
8/10/2003; 8:20:14 PM (reads: 1202, responses: 2)
I think it is much better to think of TDD as a specification technique, not a testing method.

Tests make poor specifications since they are not nearly as concise.

This is why I say TDD and DBC are complementary. With each test, decide if the contract has changed. If so, restate it.

DBC provides the concise specification. Tests exercise the code, and therefore the specification.
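A small Java sketch of that division of labour (isqrt is a made-up example; plain assert plays the role of the contract language):

    import junit.framework.TestCase;

    public class IsqrtTest extends TestCase {
        // The contract is the concise specification.
        static int isqrt(int n) {
            assert n >= 0 : "precondition: n is non-negative";
            int r = (int) Math.sqrt(n);
            long rl = r;  // widen to avoid overflow in the check below
            assert rl * rl <= n && (rl + 1) * (rl + 1) > n
                 : "postcondition: r is the integer square root";
            return r;
        }

        // The test exercises the code and, with assertions enabled
        // (java -ea), the contract along with it.
        public void testSquaresAndNonSquares() {
            assertEquals(3, isqrt(9));
            assertEquals(3, isqrt(15));
            assertEquals(4, isqrt(16));
        }
    }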

Ehud Lamm - Re: Udell: Test before you leap into development
8/11/2003; 12:09:12 AM (reads: 1224, responses: 0)
DBC contracts are also not very expressive. But both methods are better than nothing.

Marc Hamann - Re: Udell: Test before you leap into development
8/11/2003; 6:49:46 AM (reads: 1218, responses: 0)
Tests make poor specifications since they are not nearly as concise.

Compared to what? If you mean compared to, say, a specification done in Z, I would agree.

Since most projects don't use Z or anything like it, but rather somewhat more fuzzy and verbose natural language specs, I'm not sure well-written and concise test-code does so badly in comparison.

Carlos Scheidegger - Re: Udell: Test before you leap into development
8/11/2003; 9:20:27 AM (reads: 1167, responses: 1)
First of all, hello everyone. I'm a long-time lurker here, and a lowly compsci undergrad :). I'm not a native English speaker, so I apologize in advance for the grammatical errors that are sure to come...

I have a (hopefully not dumb) question concerning TDD for you guys. How do you think systems like QuickCheck go along with TDD and DbC? I think the ability to express properties of a function and test them "off-line", so to speak, is a good compromise between DbC and TDD. The advantage I see is that you're able to think about the properties without actually writing the tests themselves. I've seen code being developed "around the tests", so to speak. I know this is exactly what TDD is about, but I'm concerned with the quality of the tests. Testing explicitly for properties of the function seems to mitigate this somewhat.

I see two main problems. The first, covered extensively in the QuickCheck paper, is the generation of satisfactory random data. The second, and I think worse, is dealing with state. Haskell, of course, does not have this problem (outside of IO, at least). I think this could be worked around by stating the outside state in the properties and having the system set this state explicitly. This would look like a "lifting" of the impure function.
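For example, here is the kind of property I have in mind, hand-rolled in Java rather than real QuickCheck (no shrinking, naive random data; reverse is just a convenient pure example):

    import java.util.Arrays;
    import java.util.Random;

    // QuickCheck-style check: for 100 random arrays, reversing twice
    // must give back the original. A real QuickCheck also shrinks
    // counterexamples and lets you control the data distribution.
    public class ReverseProperty {
        static int[] reverse(int[] a) {
            int[] r = new int[a.length];
            for (int i = 0; i < a.length; i++) r[i] = a[a.length - 1 - i];
            return r;
        }

        public static void main(String[] args) {
            Random rng = new Random(42);
            for (int trial = 0; trial < 100; trial++) {
                int[] xs = new int[rng.nextInt(20)];
                for (int i = 0; i < xs.length; i++) xs[i] = rng.nextInt();
                if (!Arrays.equals(xs, reverse(reverse(xs))))
                    throw new AssertionError("falsified at trial " + trial);
            }
            System.out.println("OK, passed 100 tests.");
        }
    }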

So, what do you guys think? (And sorry for the longish post at this point in the thread)

Ehud Lamm - Re: Udell: Test before you leap into development
8/11/2003; 9:28:25 AM (reads: 1190, responses: 0)
Good questions, but I'll let others answer, as I am going out in a few minutes. But to give some perspective, here's a quote from the QuickCheck chapter in The Fun of Programming:

QuickCheck finds errors very effectively, and helps us tighten up specifications, by identifying forgotten preconditions and invariants, for example. The specification then becomes a form of computer-checked documentation, with strong evidence (if not absolute guarantee) of correctness.

Patrick Logan - Re: Udell: Test before you leap into development
8/11/2003; 9:43:02 AM (reads: 1176, responses: 0)
DBC contracts are also not very expressive. But both methods are better than nothing.

You are right. DBC is really a first step toward writing specifications for programmers who otherwise would not even take that step.

Vesa Karvonen - Re: Udell: Test before you leap into development
8/11/2003; 11:32:43 AM (reads: 1155, responses: 0)
Let's assume that we are working in a pure language, and we need to specify the precondition and postcondition for an implementation of an arbitrary recursive function f.

Proposition: It is possible to make the precondition such that it is true if and only if f(x) is defined. Let's call a precondition that satisfies this proposition perfect.

Proof: Because the function f is recursive, there is a Turing machine Mf that decides it. A valid implementation of the precondition is therefore a program that simulates Mf with input x and is true iff Mf accepts x and false iff Mf rejects x.

Proposition: It is possible to make the postcondition such that it is true if and only if the value returned by the implementation of f for x is f(x). Let's call a postcondition that satisfies this proposition perfect.

Proof: Implement the postcondition so that it compares the result of Mf with the result of the implementation of f.
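In code, the construction looks roughly like this (a Java sketch; gaussSum plays the implementation of f, and referenceSum plays the role of simulating Mf):

    public class PerfectPostcondition {
        // Implementation of f: closed form for 0 + 1 + ... + n.
        static long gaussSum(long n) {
            assert n >= 0 : "precondition";
            long result = n * (n + 1) / 2;
            // Postcondition: compare with the trivially correct
            // reference computation, i.e. "simulate Mf" and check
            // that the results agree.
            assert result == referenceSum(n) : "postcondition violated";
            return result;
        }

        // Obviously correct, possibly far slower, reference for f.
        static long referenceSum(long n) {
            long s = 0;
            for (long i = 1; i <= n; i++) s += i;
            return s;
        }

        public static void main(String[] args) {  // run with: java -ea
            System.out.println(gaussSum(1000000));
        }
    }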

So, intuitively, it seems to me that there is very little that DBC cannot express. (I'm sure you can extend the above propositions to recursively enumerable functions and semidecidable Turing machines in a natural way.)

Corollary: If all preconditions and postconditions of a (pure and recursive) program are proven perfect, the program will either only terminate with a correct result or will terminate with a statement that a precondition or a postcondition was violated or will not terminate at all.

Proof: Let's assume that the program would produce an incorrect result. This means that at some point an implementation of a function must produce an incorrect result. This contradicts the assumption that all preconditions and postconditions were perfect, because a perfect postcondition would detect the incorrect result and would terminate the execution of the program. Thus the assumption that the program would produce an incorrect result must be false.

I think that this is not a weak result. It is a very desirable property - probably the next best thing to proving that a program is correct. For example, if this property were true of all programs, we could rest assured that any knowledge we have that is based on the results computed by some program would be correct.

Also note that the above only proves that a perfect precondition and a perfect postcondition always exists (for a recursive function f). It doesn't mean that they should be implemented in terms of Mf. Furthermore, note that it is possible to reuse perfect preconditions and postconditions. You could, for example, accumulate a large library of perfect preconditions and postconditions reducing the amount of proofs that you would need to do for any particular program.

Ehud Lamm: DBC contracts are also not very expressive.

In light of my reasoning above, I claim that DBC is expressive enough for proving interesting properties of programs.

Frank Atanassow: the idea that DbC (as it stands) can be an effective static technique is hopelessly naive.

You are correct that DBC is not powerful enough for proving that a program is correct. However, as I have reasoned above, DBC is powerful enough for proving that a program will not produce incorrect results. Therefore I claim that it is not hopelessly naive to think that DBC can be an effective static technique.