archives

RDF and Databases

Some RDF research dropped me to a nice paper (PDF) from IBM discussing RDF with relational databases. This combination can replace half-baked application data mechanisms. These crop up regularly in my consulting work. Think nested directories of Windows INI files and brittle, binary files breaking on minor design iterations. The pain, the pain.

Someone should describe RDF in 500 words or less as a generalization of INI. That note would spread understanding of RDF, which is simple but often described so abstractly that it seems complicated. It's better to start from the known and move to the unknown.

Here is a short attempt, just to spark interest. Experts may call me all wet. Windows INI format uses "key-equals-value," with keys grouped into sections. Think of "key-equals-value" as a special case of RDF's "subject-predicate-object." RDF generalizes to any verb, not just "equals," along with superior grouping. While INI nests just one level down (via sections), RDF URIs handle arbitrary nesting (via slashes), and URIs also permit remote data. That is not to say RDF data must be tree-structured. Most RDF papers focus too much on XML. XML is merely one expression syntax. There are several others and a relational database will store RDF data in its own way, completely independent of XML.

There are several projects in this domain. My favorite so far is OpenRDF Sesame. It supports querying at the semantic level. It seems more mature than others, having derived from previous efforts, and works with both PostgreSQL and MySQL as well as Oracle. An abstraction layer called SAIL makes Sesame database-agnostic. Sesame even sports a stand-alone b-tree system, or in-memory operation, if you don't want an external database. I like PostgreSQL much better than MySQL for its loose BSD license and technical merits. Apropos of that, another bit of news is that PostgreSQL now works natively on Windows. (The PostgreSQL client has always worked natively as a DLL.) PostgreSQL speed issues mentioned in Sesame papers have improved. As for Sesame, the only drawback is Java. But since Sesame interfaces over TCP through Java servlets, that's a don't-care.

On a related note, I looked into Python-based Chandler. The story there is that it's a custom job because, says Andi Vajda, When I started working on this project in May, the repository was late, very late, and the project was stalled because of that. I felt I could get something usable for the project to resume much faster if I started a data model implementation from scratch and persisted it using Sleepycat's Berkeley dbxml and Berkeley db. Today, the Chandler repository is not really so much an object database as an item XML database combined with large collections of references directly stored in Berkeley DB. Hmm...project behind, so build from scratch? I'm not clear why Chandler didn't go with RDF, but it sounds like project management problems. It seems as though RDF would support all that Chandler wants to do without the constrictions of XML. Note that Sesame has Python bindings.

PLaSM - functional language for computing with geometry

I found this looking in Eclipse plug-ins:

PLaSM, the Programming LAnguage for Symbolic Modeling
is a ``design language", developed by the CAD Group at the Universities of Roma ``La Sapienza" and ``Roma Tre" [PS92,PPV95].

The language is strongly inFLuenced by FL. With few sintactical differences, it can be considered a geometric extension of a FL subset.

Also, the current PLaSM implementation is open-source and based on PLT Scheme.