Lambda the Ultimate

Design Paradigms
started 1/9/2002; 12:48:32 PM - last post 1/14/2002; 1:22:36 PM
Brent Fulgham - Design Paradigms
1/9/2002; 12:48:32 PM (reads: 823, responses: 6)
Henry Petroski's excellent book "Design Paradigms" examines case histories in engineering (in this case, Structural Engineering) as a means of highlighting common flaws in the design process.

While the specific stories in Petroski's book are about bridge failures and other Civil works, the underlying themes and analysis are applicable to Software Engineering as well. This can perhaps be best illustrated in this quote from the introduction:

"...the state of the art is often only a superficial manifestation, arrived at principally through analytical and calculational tools, of what is understood about the substance and behavior of the products of engineering. Anyone who doubts this assertion need only look at the design errors and failures that occur in the climate of confidence, if not hubris, known as the state of the art."

The book also reports comments by the National Science Foundation on design failures, in which it recommended that past design failures be studied by more engineers, since "... in many cases the same errors are repeated again and again."

While Petroski may lament the fact that more "brick and mortar" engineers do not study past failures to understand future dangers, we in the software community would seem to be in even worse shape. While our more traditional friends in other branches of engineering have case histories they can refer to, there is often very little written or discussed about software failures.

Furthermore, what little literature does exist on the topic of failed software projects asserts that these have all been management failures. One might be led to believe that if it were not for failed managers all software projects would be resounding successes -- something that most of us would agree is patently untrue.

So, I propose that we as engineers, scholars, and students go about collecting case histories of software engineering failures so that we might learn from the mistakes of others. Certainly Fred Brooks's The Mythical Man-Month is an excellent start. But perhaps we could establish a web log such as this one that we could use to collect our own sets of stories -- what was the original design intent? What was the result? What went wrong? What lessons might we take away from this?

I anxiously await your comments!

Ehud Lamm - Re: Design Paradigms
1/9/2002; 1:13:05 PM (reads: 861, responses: 0)
Check out RISKS.

BTW, the connection to programming languages is far from trivial, though it is clear that programming languages impact reliability.

Noel Welsh - Re: Design Paradigms
1/10/2002; 4:06:22 AM (reads: 857, responses: 3)
It's a difficult problem to address because of social and technical factors. In my experience when software projects go wrong one slings blame around, finds a scapegoat, then scuttles off to watch the blood-letting from a safe distance. Perhaps this is because, unlike civil engineering firms, programmers are often employees of those who want programs developed and so the programmer is not so concerned with protecting their reputation in the way a civil engineering firm is.

Technical problems are complex and often difficult to pin down. For example, I've spent a number of weeks in a profiler trying to improve system performance. Unable to improve performance despite increasing the 'hotspot' speed 5x, I concluded that the profiler overhead was so high the reported characteristics did not represent the real operating characteristics of the system. I never did get that system running (significantly) faster. In this sort of situation it is difficult to determine what went wrong. I suspect there was some sort of thread bottleneck, but without the time and resources to properly investigate I may as well have been reading tea-leaves to find the problem.

So in conclusion I don't think the industry is set up to document its problems. All the famous SE bloopers I can think of (Ariane, THERAC) had an external body demanding an explanation. That typically isn't the case. Until SEs are accountable to the same extent other Engineers are, it will be more convenient to sweep failures under the carpet and continue with business as usual.

Frank Atanassow - Re: Design Paradigms
1/10/2002; 4:54:26 AM (reads: 913, responses: 2)
Technical problems are complex and often difficult to pin down.

This reminds me of something Carl Sagan said once in a lecture of his which I was fortunate enough to catch. I wish I had an exact quote, but the gist of it was that it's not that "hard" sciences are more difficult than "soft" sciences, but rather quite the opposite. This is why physics, biology and mathematics have made massive progress over the past couple thousand years, while topics such as psychology and economics, for which it is much harder to formulate unambiguous and testable hypotheses, have made very little progress, or, indeed, have only become "sciences" recently, and rigorous ones only since the adoption of statistical methods. In other words, the "soft" sciences are the ones which grapple with problems which are much more difficult to formulate and solve, due to lack of the right tools.

So, I think that technical problems are exactly the ones which are easiest to address and examine.

Furthermore, I think programmers have it much easier than civil engineers in the sense that it doesn't cost you anything to "run an experiment" on your computer, whereas running an experiment in real life can be expensive, time-consuming and even dangerous.

Of course, I agree with the claim that the industry is not set up to document its problems. The state of the programming industry now is still something like the American Wild West (we even have our own Myth of the Rugged Pioneer/Gunslinger, namely the Hacker), and everybody wants to do Their Own Thing. But this will inevitably change...

Noel Welsh - Re: Design Paradigms
1/14/2002; 6:08:14 AM (reads: 931, responses: 1)
Frank, at first I was going to respond saying that technical problems aren't easy, you often can't run the experiment because the system requires 3 computers and a day to set up correctly etc. Then I thought about it and realised that the only reason we have these technical problems is because there are deeper social problems (like poor communication, blame shifting) that stop us from addressing the technical problems.

If software is developed in an XP style you should never reach the end of a 3/6/12/24 month development cycle and discover that you've developed a load of crud. IMHO XP is a social methodology in that it relies on creating the right social environment for the team to function effectively. Social factors like peer group pressure and an internal sense of professionalism dominate over production of dead trees. Contrast with technical methodologies that Require Unreasonable Paperwork. They are often easier to implement because you don't have to get people to do the right thing, you just have to get them to appear to do the right thing (fill out the correct forms, produce loads of documentation that never gets read). It's obvious which style of methodology I think is better!

I think there is a trend towards accepting and dealing with the social problems. Unfortunately they're hard problems and hackers tend to be naturally antisocial. Are we wildly off topic yet? :)

Ehud Lamm - Re: Design Paradigms
1/14/2002; 6:24:06 AM (reads: 976, responses: 0)
Are we wildly off topic yet? :)

Yep

To bring this discussion closer to PL, let me just say that from what I've seen XP (extreme programming) is most applicable when using OO high-level languages. Examples? Many classic refactorings are about inheritance hierarchies; low-level code takes longer to write, so it is much harder to have daily builds, etc.

It is interesting to note that many XP advocates (e.g., Robert Martin) are in favor of dynamic typing coupled with unit testing, instead of strong static typing.

Brent Fulgham - Re: Design Paradigms
1/14/2002; 1:22:36 PM (reads: 831, responses: 0)
It's true that the Software Engineering community will eventually mature and develop metrics like other engineering disciplines. However, this will only happen in a timely fashion if we in the community educate management in the need for post-mortems. I don't see that currently happening in industry.

Noel, I'm not sure the difficulties you faced with your profiling example are much different than those in the brick-and-mortar community when tasked with discovering the cause of a bridge failure. As you point out, discovering the underlying cause for a fault is simply a matter of resources.

Project failure post mortems are rare in the old engineering world as well. However, there seem to be a greater number of people in those disciplines who have seen the relative merit in understanding failures and their causes and using them to avoid future mishaps.

I think an effort that would pay off in the long run is to create a "literature review" or even just a central clearinghouse for anecdotal evidence about what kinds of problems people have had in their development efforts. This kind of data would over the long run allow Software Engineers to develop some of the same "instincts" about good and bad designs.