Lambda the Ultimate

Tim Bray: The History of RDF
started 5/22/2003; 2:39:10 AM - last post 5/25/2003; 3:19:15 PM
Ehud Lamm - Tim Bray: The History of RDF
5/22/2003; 2:39:10 AM (reads: 1587, responses: 12)
Tim Bray: The History of RDF
Sometime in the mid-nineties, a guy named Ramanathan V. Guha (everyone calls him Guha because they're scared of “Ramanathan,” go figure) went to work for Apple Computer. He cooked up this metadata format named Meta Content Framework (MCF), and then they built a cool app called Hotsauce, driven by MCF, that represented web sites as little 3-D planetary systems and you could fly around them...

I like to point out from time to time (i.e., whenever I get the chance) that utilizing XML is all about language design. This essay provides some lessons culled from Tim's experience with RDF. One of the major lessons Tim discusses, It's the Syntax, Stupid!, is likely to cause concern for some LtU regulars...


Posted to xml by Ehud Lamm on 5/22/03; 2:40:54 AM

Paul Snively - Re: Tim Bray: The History of RDF
5/22/2003; 10:15:23 AM (reads: 926, responses: 1)
I'm reminded of a recent conversation with a programmer whom I respect very much, in which he expressed his displeasure with the phrase "syntactic sugar," which struck him as arrogant and dismissive in the extreme. His point also was that "syntax matters." I had to spend some time thinking about why he and I would have such different reactions to the phrase, and what I eventually concluded was that I perceive the expression as referring to syntax that adds nothing to the conversation, if you will ("Syntactic sugar causes cancer of the semicolon." — Alan Perlis) whereas my friend had the sense that the phrase was more generally applied than that, and perhaps it is, in practice.

I think Tim Bray's point is well-taken. It certainly should be since, as Tim reminds us, he helped to create the very syntax that he's criticizing. In the beginning of the definition of the XML serialization of RDF, I don't know that there was an adequate appreciation of the impedance mismatch between the graph-structured RDF data model and the tree-structured syntactic model of XML, and so I suspect that the mismatch manifests itself in that serialization. Frankly, I don't find the alternatives compelling, mostly because they punt on the graph structure and rely on a simple (simplistic?) representation of triples. While I understand that this is still just a serialization format, meaning that the graph structure will be reconstituted when the serialization is parsed, I find that the stated goal of readability of the serialization is not met since this vital aspect of the RDF data model is obfuscated by a triple-based representation.

So it seems to me that we're left with the argument that the alternative syntaxes can be read and written by hand. I also don't find this compelling, and the Dreamweaver/HTML analogy seems lacking: we're not talking about presentation of human bite-sized pieces of information that fit in a screen or two; we're talking about arbitrary amounts of machine-readable metadata. If we must confine ourselves only to what a single human being can hold in his/her head, or structures that can be trivially represented as trees, we radically reduce the scope of applicability of RDF. The challenge is to come up with the spreadsheet, if you will, of RDF: perhaps ground-breaking technology when it is first invented, but something that we all now take for granted to such an extent that if someone pulled out an actual, physical spreadsheet and started writing equations on it, we would quite rightly look at them funny. That's all the tool-builders want, and all the angst about "waiting for tools" and "complexity" seems misguided. Computers are here to shoulder the burden of complexity in a variety of application domains; why should structured metadata authoring be considered out of bounds? The fact that we don't currently know what such a "spreadsheet" should look like reflects the novelty of the domain, not the unworthiness of the goal.

Patrick Logan - Re: Tim Bray: The History of RDF
5/22/2003; 12:14:54 PM (reads: 902, responses: 0)
Paul writes: The fact that we don't currently know what such a "spreadsheet" should look like reflects the novelty of the domain, not the unworthiness of the goal.

Does it also say something about the urgency of the goal?

Ehud Lamm - Re: Tim Bray: The History of RDF
5/23/2003; 1:14:57 AM (reads: 830, responses: 0)
Syntactic sugar is, obviously, not the same as all syntax. The term is used for syntax that can be easily and locally transformed to more basic forms in the language.

It is also worthwhile to keep in mind that deciding that a syntactic form carries no useful semantic information depends, often enough, on implementation intuition. Is let syntactic sugar over lambda? The transformation is trivial enough. However, using let can provide a valuable hint to the interpreter.
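
As a rough illustration of how local the let-to-lambda transformation is, here is a Python sketch over a made-up tuple encoding of expressions. The encoding and the function name desugar are assumptions for illustration only, not any particular interpreter's representation.

```python
def desugar(expr):
    """Rewrite ('let', [(name, value), ...], body) into an applied lambda.

    Expressions are nested tuples; anything that is not a tuple
    (symbols as strings, constants, parameter lists) is returned unchanged.
    """
    if not isinstance(expr, tuple):
        return expr
    if expr and expr[0] == "let":
        _, bindings, body = expr
        names = [name for name, _ in bindings]
        values = [desugar(value) for _, value in bindings]
        # (let ((x v)) body)  ==>  ((lambda (x) body) v)
        return (("lambda", names, desugar(body)),) + tuple(values)
    # Any other form: desugar the sub-expressions in place.
    return tuple(desugar(e) for e in expr)


sugared = ("let", [("x", 1)], ("+", "x", 2))
plain = desugar(sugared)
```

The rewrite never looks outside the let form itself, which is what makes it "sugar". The hint mentioned above is exactly what the rewritten form loses: the interpreter now sees a general lambda application.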

Marius Amado Alves - Re: Tim Bray: The History of RDF
5/23/2003; 3:02:16 AM (reads: 793, responses: 0)
XML is a marketing monster and a technical monstrosity.

Technically, it utterly fails its dual purpose of human and machine readability. Nobody is going to write XML by hand. On the machine front, it is inefficient in both time and space.

On the market, XML is a train nobody can afford to miss. So even developers like myself who are aware of its technical deficiencies have to make sure their products talk it. Fortunately we can limit the extent of XML in our products to import/export facilities, and use technically sound formats internally.

For writing RDF textually, here are a couple of languages that surpass XML by light-years:

- Prolog, and maybe just its fragment Datalog

- Dot, a language for annotated graphs, associated with GraphViz from AT&T Labs (www.graphviz.org)
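
To give a feel for the Dot option, here is a minimal Python sketch that writes a list of triples as a Dot digraph, with each predicate as an edge label. The function name triples_to_dot and the sample triple are made up for illustration, and the sketch assumes names contain no backslashes.

```python
def triples_to_dot(triples, graph_name="rdf"):
    """Render (subject, predicate, object) triples as a Dot digraph."""
    def quote(name):
        # Dot quoted IDs only need embedded double quotes escaped.
        return '"' + name.replace('"', '\\"') + '"'

    lines = ["digraph %s {" % graph_name]
    for subj, pred, obj in triples:
        # One labelled edge per triple: subject -> object [label=predicate]
        lines.append("  %s -> %s [label=%s];" % (quote(subj), quote(obj), quote(pred)))
    lines.append("}")
    return "\n".join(lines)


dot_text = triples_to_dot([("Ecki", "likes", "Reinhold")])
```

The resulting text can be fed to the GraphViz dot program to draw the graph.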

For writing RDF visually, I've been looking for such a tool for ages, and since I seem to be a really bad surfer, I'd appreciate suggestions.

Kimberley Burchett - Re: Tim Bray: The History of RDF
5/23/2003; 9:25:25 AM (reads: 755, responses: 0)
At Endeca, we routinely write XML by hand. We also have a GUI tool that can generate the same XML, but we're always adding new features and there's a time lag between when a feature is implemented and when it's in the GUI. So for those cases, we have to edit XML by hand.

However, I agree that other syntaxes might be better.

Paul Snively - Re: Tim Bray: The History of RDF
5/23/2003; 12:03:56 PM (reads: 738, responses: 0)
Marius Amado Alves: For writing RDF visually, I've been looking for such a tool for ages, and since I seem to be a really bad surfer, I'd appreciate suggestions.

IsaViz? Interestingly, it uses dot under the hood.

Ehud Lamm - Re: Tim Bray: The History of RDF
5/24/2003; 6:13:53 AM (reads: 705, responses: 1)
Dave Winer and Sjoerd (here and here) both have interesting comments about Tim Bray's original post.

LtU is indeed horribly slow the last couple of days. Usually the site is much more responsive.

andrew cooke - Re: Tim Bray: The History of RDF
5/24/2003; 12:07:33 PM (reads: 701, responses: 0)
Winer does have a point (about the problem being that there's nothing new or interesting). On one "ontology" mailing list I was subscribed to they spent weeks arguing whether to use XML or not - it gave the distinct impression that there's nothing more interesting happening (otoh, the w3c rdf-interest list is sometimes worth a read).

I admit I've missed large chunks, but I don't ever remember LtU being fast (at least after the first few months).

Michael Stevens - Re: Tim Bray: The History of RDF
5/25/2003; 1:57:23 AM (reads: 655, responses: 1)
On a purely practical level, I write an awful lot of XML by hand, and I think most people I know do the same thing, there not being any better tools we're aware of.

A smaller, but still interesting, proportion of people will generate XML from programs with some variant on 'print "<foo type=\"$bar\">$baz</foo>"....' (code is Perl, principle applies to many languages).
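
The print approach holds up only as long as the interpolated values are escaped; a value containing <, & or a quote otherwise yields malformed XML. Here is a minimal Python sketch of the same idea with escaping, using the standard library's xml.sax.saxutils. The function name foo_element is hypothetical; foo, type, bar and baz are taken from the Perl fragment.

```python
from xml.sax.saxutils import escape, quoteattr


def foo_element(bar, baz):
    """Build <foo type=...>...</foo> by string formatting, with escaping."""
    # quoteattr() escapes the value and adds the surrounding quotes itself;
    # escape() handles &, < and > in element content.
    return "<foo type=%s>%s</foo>" % (quoteattr(bar), escape(baz))


snippet = foo_element('a"b<c', "x & y < z")
```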

Marius Amado Alves - Re: Tim Bray: The History of RDF
5/25/2003; 3:34:42 AM (reads: 646, responses: 0)
RDF/XML needs many additions to become readable/writeable. One of these things is certainly a 'use' statement.

I find "RDF M&S revisited: From Reification to Nesting, from Containers to Lists, from Dialect to pure XML" by Conen et al. one of the best papers in SWWS'01, The First Semantic Web Symposium. Conen et al. describe an extended RDF model and syntax. They actually use two syntaxes. One is XML. The other has this grammar:

  R ::= r | '(' R ',' R ')' | '[' R ',' R ',' R ']'

to which they add a lot of syntactic sugar, e.g. commas are dropped and nested lists are flattened.

Example (section 3):

  [Gustaf says [Ecki likes (Reinhold Wolfram)]]

This is expressed in 'flat' statements via reification and containers:

  [Gustaf says r1]
  [r1 rdf:type rdf:Statement]
  [r1 rdf:subject Ecki]
  [r1 rdf:predicate likes]
  [r1 rdf:object l1]
  [l1 rdf:type rdf:Seq]
  [l1 rdf:_1 Reinhold]
  [l1 rdf:_2 Wolfram]

A use statement would improve this already-better-than-XML syntax:

  [use rdf]
  [Gustaf says r1]
  [r1 type Statement]
  [r1 subject Ecki]
  [r1 predicate likes]
  [r1 object l1]
  [l1 type Seq]
  [l1 1 Reinhold]
  [l1 2 Wolfram]

Note the use statement is my proposal, not Conen et al.'s (nor anyone else's as far as I know, so maybe I should run to the patent office now...). Note I also changed _1 to 1. Obviously the integral numbers are not rdf names! Unnecessary complication is a plague in XML and RDF, and I suspect one of the things putting people off.
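
The flattening step in the example (nested statement to 'flat' statements via reification and containers) is mechanical. Here is a minimal Python sketch under some assumptions: a Python list [s, p, o] stands for a statement, a tuple for a sequence, and a string for a name. The function name flatten and the generated names r1, l1, ... are illustrative, the sequence type is written rdf:Seq, and none of this is taken from Conen et al.'s paper.

```python
from itertools import count


def flatten(statement):
    """Flatten a nested [s, p, o] statement into a list of flat triples."""
    r_ids, l_ids = count(1), count(1)

    def describe(term):
        # Returns (name, triples describing that name).
        if isinstance(term, str):
            return term, []
        if isinstance(term, list):  # nested statement: reify it
            rid = "r%d" % next(r_ids)
            (s, s_tr), (p, p_tr), (o, o_tr) = (describe(t) for t in term)
            own = [(rid, "rdf:type", "rdf:Statement"),
                   (rid, "rdf:subject", s),
                   (rid, "rdf:predicate", p),
                   (rid, "rdf:object", o)]
            return rid, own + s_tr + p_tr + o_tr
        if isinstance(term, tuple):  # sequence: emit a container
            lid = "l%d" % next(l_ids)
            own, rest = [(lid, "rdf:type", "rdf:Seq")], []
            for i, member in enumerate(term, 1):
                name, member_tr = describe(member)
                own.append((lid, "rdf:_%d" % i, name))
                rest += member_tr
            return lid, own + rest
        raise TypeError("unsupported term: %r" % (term,))

    (s, s_tr), (p, p_tr), (o, o_tr) = (describe(t) for t in statement)
    return [(s, p, o)] + s_tr + p_tr + o_tr


flat = flatten(["Gustaf", "says", ["Ecki", "likes", ("Reinhold", "Wolfram")]])
```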

andrew cooke - Re: Tim Bray: The History of RDF
5/25/2003; 7:41:17 AM (reads: 665, responses: 0)
for writing xml, the emacs sgml mode is quite useful.

Oleg - Re: Tim Bray: The History of RDF
5/25/2003; 3:19:15 PM (reads: 629, responses: 0)
If I may remark, it's possible to deal with relations in their natural form (e.g., as triples) and then mechanically convert them to RDF. There is no need to code RDF by hand. The following article shows a real example:
http://pobox.com/~oleg/ftp/Scheme/xml.html#SXML-as-database

I have used this approach to prepare the submission of OMF (a weather observation markup format) into the DISA repository. The important part of the submission was the description of all the elements, attributes and documents -- in a format that closely resembles RDF. For some reason MITRE, which designed the repository, did not like RDF itself. When I mentioned RDF at one of the meetings, I was told that the RDF Recommendation was 'rushed' and was called 'Sir'.

In more detail, the example is described in Section 5 of
http://pobox.com/~oleg/ftp/papers/SXs.pdf

The reverse transformation, parsing of namespace-rich RDF documents (in particular, DAML ontologies) is also possible:
http://pobox.com/~oleg/ftp/Scheme/daml-parse-unparse.scm

It's trivial to convert the parsed ontology into a set of triples. OTOH, we can use SXPath on the SXML tree directly.
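
For readers without Scheme at hand, the triples-to-RDF direction mentioned at the start of this post can be sketched in a few lines of Python with the standard xml.etree.ElementTree module. This is not the SXML code from the linked articles. The function name triples_to_rdfxml and the example.org URIs are made up, predicates are assumed to arrive already split into (namespace, local name), and only resource-valued objects are handled (no literals).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def triples_to_rdfxml(triples):
    """Serialize (subject_uri, (pred_ns, pred_local), object_uri) triples."""
    ET.register_namespace("rdf", RDF_NS)
    root = ET.Element("{%s}RDF" % RDF_NS)
    descriptions = {}
    for subj, (ns, local), obj in triples:
        desc = descriptions.get(subj)
        if desc is None:
            # One rdf:Description element per distinct subject.
            desc = ET.SubElement(root, "{%s}Description" % RDF_NS,
                                 {"{%s}about" % RDF_NS: subj})
            descriptions[subj] = desc
        # Each predicate becomes a property element pointing at the object.
        ET.SubElement(desc, "{%s}%s" % (ns, local),
                      {"{%s}resource" % RDF_NS: obj})
    return ET.tostring(root, encoding="unicode")


rdf_xml = triples_to_rdfxml([
    ("http://example.org/Ecki",
     ("http://example.org/terms#", "likes"),
     "http://example.org/Reinhold"),
])
```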