Software Engineering

Interview with Donald Knuth

Nice surprise on my way to work this morning with an NPR story on Donald Knuth on the radio this morning. (Favorite quote: We are competing against ignorance).

OOP Is Much Better in Theory Than in Practice

An critique of OOP. The article is about OOP as a SE/design approach and doesn't directly attack the issue from a PL angle, but it might still interest LtU readers.

From a PL point of view, I would have chnaged the title to: OOP Is Much Better in Theory Than in Practice, (And the Theory Isn't too Good anyway).

Ian Bicking: The challenge of metaprogramming

So I think it's really important that we approach metaprogramming with caution. I think Guido has been right to resist macros. Not because they are necessarily wrong, but because we haven't yet figured out how to do them right. And maybe we never will, maybe source code simply isn't the right level for abstractions. I think it's good that Python doesn't do tail-call elimination; it seems like a nice feature, but in subtle ways it makes the language harder. And I think continuations could lead to bad things. There are wrong paths on the road to higher-level programming. (Though in some ways I also feel the opposite: tell us not to shoot ourselves in the foot, and expect us to listen, don't assume we'll abuse every feature that exists; there's always a tension in these design choices.)

This deserves more attention than I can give it right now, but I am sure others here will want to comment.

SPARKAda

A SPARK program has a precise meaning which is unaffected by the choice of Ada compiler and can never be erroneous.

From the examples in the chapter, I thought it looked surprisingly simple to use - comparable to adding contracts in DbC, for example. I guess the analysis requires a little more effort?

This has been mentioned only in passing by Ehud so I hope it's worth a post of its own. And I'm amused by the idea that Ada is a sloppy, ill-defined language... ;o)

Use Continuations to Develop Complex Web Applications

An introductory article from IBM developerWorks on Cocoon, continuation-based (sometimes called "modal") web applications, and such.

If you've ever developed a non-trivial Web application, you know that development complexity is increased by the fact that Web browsers allow users to follow arbitrary navigation paths through the application. No matter where the user navigates, the onus is on you, the developer, to keep track of the possible interactions and ensure that your application works correctly. While the traditional MVC approach does allow you to handle these cases, there are other options available to help resolve application complexity. Developer and frequent developerWorks contributor Abhijit Belapurkar walks you through a continuations-based alternative that could simplify your Web application development efforts.

via comp.lang.scheme

Integrating support for undo with exception handling

MSR: Integrating support for undo with exception handling. Avraham Shinnar; David Tarditi; Mark Plesko; Bjarne Steensgaard. December 2004.

One of the important tasks of exception handling is to restore program state and invariants. Studies suggest that this is often done incorrectly. We introduce a new language construct that integrates automated memory recovery with exception handling. When an exception occurs, memory can be automatically restored to its previous state. We also provide a mechanism for applications to extend the automatic recovery mechanism with callbacks for restoring the state of external resources. We describe a logging-based implementation and evaluate its effect on performance. The implementation imposes no overhead on parts of the code that do not make use of this feature.

The authors propose a try_all construct that restores program state, and analyze its semantics. They implemented the proposed construct using Bartok, a research compiler and runtime system for CIL.

I guess some here would see all the work required to implement this construct and the issues it raises as a demonstration of the perils of state...

Grady Booch: Microsoft and Domain Specific Languages

Grady Booch's contribution to the discussion on UML vs. DSLs.

Along the way we learn about UML specialization mechanisms, UML profiles, and Grady's opinions as regards tool vs. language issues.

RDF Elevator Pitch

Eureka, the perfect RDF introduction with thanks to A.M. Kuchling (amk). Nothing beats crayon-colored diagrams. It is short, sweet, and hits the main points precisely, including 'political' issues at the end. Much W3C advocacy makes the Semantic Web sound too futuristic....The RDF Core spec is hard to read and really boring....Introductory tutorials are few....Simple things can be done without much effort, and can still be useful.

On one island are the semantic web folks. On another island are semantic filesystem folks. A summit seems in order. I don't hear much about the two working together, but then I live on yet another island. RDF+ReiserFS looks like a match made in heaven, for example, Reiser4 uses dancing trees, which obsolete the balanced tree algorithms used in databases...Do you want a million files in a directory, and want to create them fast? No problem.

From the article,

Reiser has "substantial plans" for adding new kinds of semantics to ReiserFS to help it challenge Microsoft's efforts. "We're planning on competing with the Longhorn filesystem," he says.

The new ReiserFS will eschew the relational algebra approach and work with semistructured data. "The person entering data can employ [the] structure inherent in the data rather than forcing a structure," Reiser said, adding, "Flexibility in querying and creating data is our target. [This] will stand in contrast to Microsoft's SQL-based approach, which does not have that flexibility."

Jon Udell: interview with Ward Cunningham and Jack Greenfield

Jon Udell's interview with Ward Cunningham and Jack Greenfield might help understand Microsoft's methodology of software factories and DSLs.

The interview is available as a 54 minute MP3 file. The notion of language as abstraction mechanism and explanation of the part played by DSLs appear towards the second half of the conversation.

RDF and Databases

Some RDF research dropped me to a nice paper (PDF) from IBM discussing RDF with relational databases. This combination can replace half-baked application data mechanisms. These crop up regularly in my consulting work. Think nested directories of Windows INI files and brittle, binary files breaking on minor design iterations. The pain, the pain.

Someone should describe RDF in 500 words or less as a generalization of INI. That note would spread understanding of RDF, which is simple but often described so abstractly that it seems complicated. It's better to start from the known and move to the unknown.

Here is a short attempt, just to spark interest. Experts may call me all wet. Windows INI format uses "key-equals-value," with keys grouped into sections. Think of "key-equals-value" as a special case of RDF's "subject-predicate-object." RDF generalizes to any verb, not just "equals," along with superior grouping. While INI nests just one level down (via sections), RDF URIs handle arbitrary nesting (via slashes), and URIs also permit remote data. That is not to say RDF data must be tree-structured. Most RDF papers focus too much on XML. XML is merely one expression syntax. There are several others and a relational database will store RDF data in its own way, completely independent of XML.

There are several projects in this domain. My favorite so far is OpenRDF Sesame. It supports querying at the semantic level. It seems more mature than others, having derived from previous efforts, and works with both PostgreSQL and MySQL as well as Oracle. An abstraction layer called SAIL makes Sesame database-agnostic. Sesame even sports a stand-alone b-tree system, or in-memory operation, if you don't want an external database. I like PostgreSQL much better than MySQL for its loose BSD license and technical merits. Apropos of that, another bit of news is that PostgreSQL now works natively on Windows. (The PostgreSQL client has always worked natively as a DLL.) PostgreSQL speed issues mentioned in Sesame papers have improved. As for Sesame, the only drawback is Java. But since Sesame interfaces over TCP through Java servlets, that's a don't-care.

On a related note, I looked into Python-based Chandler. The story there is that it's a custom job because, says Andi Vajda, When I started working on this project in May, the repository was late, very late, and the project was stalled because of that. I felt I could get something usable for the project to resume much faster if I started a data model implementation from scratch and persisted it using Sleepycat's Berkeley dbxml and Berkeley db. Today, the Chandler repository is not really so much an object database as an item XML database combined with large collections of references directly stored in Berkeley DB. Hmm...project behind, so build from scratch? I'm not clear why Chandler didn't go with RDF, but it sounds like project management problems. It seems as though RDF would support all that Chandler wants to do without the constrictions of XML. Note that Sesame has Python bindings.

XML feed