Fabian Pascal on XQuery

I think Pascal's style of argumentation is a model of how not to do it ("it" in this case being the remedy of ignorance and the construction of informed consensus - or the setting of terms for informed debate), but be that as it may - he makes a few good points here:

If you liked SQL, you'll love XQuery

Among the good points that he makes are these:

  • If XML's point is to be language-independent, why an XML-specific language?
  • If XML's point is to be database-independent, why reinvent the data management wheel (and, we shall argue, a "square wheel" at that?)
  • If XML is for syntactic interchange, can it be used for semantic data management?

Pascal's contention is that XQuery's language design cannot achieve its goals because of the designers' neglect of fundamentals (by which he means the Word of Codd). What are the foundations of XQuery? Is Pascal's preference for Codd's relational model over models based on graph theory as dogmatic as it sounds?


Pascal's Pensees on XML

Is Pascal's preference for Codd's relational model over models based on graph theory as dogmatic as it sounds?

Pretty much, yes. C.J. Date and Pascal have been singing this song for a while (the relational model is the perfect data model, why has everyone screwed it up, why does everyone neglect it, etc.)

On the other hand, they do make some interesting points about the shortcomings of XML.

Shocking!

Dominic: I think Pascal's style of argumentation is a model of how not to do it

Heh, when I first started reading Pascal's diatribes I had the uncomfortable feeling that I was reading my own posts. :) I really understand their frustration, but I agree that they could pursue more fruitful methods to achieve their goals.

However I generally agree with Pascal's and Date's points, inasmuch as I can understand them, being something of a database outsider.

Marc: the relational model is the perfect data model

I don't think they're saying this. They're saying that, of the known data models, it is the best because it's the most general, the best understood, the most `complete' and the simplest. See the `desirable properties of a data model' at the end of Tags Do Not a Language Make - Part 1.

(I would make a similar argument in favor of typed functional languages.)

why has everyone screwed it up, why does everyone neglect it

Yeah, there is probably too much political content in Pascal's essays. I have also developed some opinions about why static and dynamic typing are so misunderstood, and occasionally considered whether it might be best to air them (and probably sometimes let them slip), but on the whole I think I'm not qualified to psychoanalyze programmers or analyze the industry or its history.

On the other hand, they do make some interesting points about the shortcomings of XML.

Yes, particularly the bit about syntax versus semantics, which echoes my own feelings.

OTOH, as W3C standards go, XQuery is a step in the right direction in terms of implementation. What I mean is, even though I don't like the design goals of XQuery or see a need for it or most of its features, the way in which they are meeting those criteria is better than usual; for example, there is a formal semantics.

However, I'm a bit mystified about the way they associate hierarchical models with graph theory. I don't know exactly what a hierarchical database model is, but I would associate it with the theory of trees. And I imagine you can achieve the same level of expressiveness as a relational model by quotienting a hierarchical model with some equations. (Would they call that a relational model again, then?) Probably they are as ill-informed about PLT as I am about DB theory, though, so I guess this is more due to miscommunication than genuine error.

Poor Public Relations

They're saying that, of the known data models, it is the best because it's the most general, the best understood, the most `complete' and the simplest.

I agree that this is what they are saying, and I would even agree that it is true for general purpose data problems, but I think that what they miss is that sometimes you don't need all that generality or completeness, and as a result there is a simpler solution for a specific problem.

To the extent that XML is touted as the cure-all for data transfer and storage problems, their criticism is on target. But I think they sometimes throw the baby out with the bath water; sometimes an annotated tree is just what you want.

However, I'm a bit mystified about the way they associate hierarchical models with graph theory. I don't know exactly what a hierarchical database model is, but I would associate it with the theory of trees.

I think this may simply be imprecision on their part. A tree can of course be construed as a directed acyclic graph, and I think this is what they mean, though of course this ignores a whole universe of interesting graphs.

XQuery is a step in the right direction in terms of implementation.

If Wadler works on something I take it seriously. ;-)
However, my impression from reading his XML-related papers is that he basically feels that XML sucks, but he sees it as a challenge to provide formalisms and tools that make it suck less. ;-)

Relations and graphs

Another thing I get hung up on is that they dismiss graph theory as being hopelessly complex, in contrast to relation(al) theory, but a(n unlabeled, directed) graph is precisely a binary relation.
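To make that concrete, here is a minimal sketch in Python, assuming a made-up three-node graph. The node names and the helper functions `compose` and `transitive_closure` are illustrative and do not come from Date's or Pascal's writing. The edge set is literally a set of pairs, and reachability falls out of ordinary relational composition.

```python
# Hypothetical example: the node names are illustrative only.
# An unlabeled directed graph is a set of (source, target) pairs,
# i.e. a binary relation on the set of nodes.
edges = {("a", "b"), ("b", "c")}


def compose(r, s):
    """Relational composition: (x, z) whenever (x, y) is in r and (y, z) is in s."""
    return {(x, z) for (x, y1) in r for (y2, z) in s if y1 == y2}


def transitive_closure(r):
    """Reachability by one or more edges, computed as a least fixed point."""
    closure = set(r)
    while True:
        bigger = closure | compose(closure, r)
        if bigger == closure:
            return closure
        closure = bigger


two_step = compose(edges, edges)        # paths of exactly two edges
reachable = transitive_closure(edges)   # paths of one or more edges
```

Nothing here is specific to graphs: `compose` is a join followed by a projection, which is the sense in which the graph view and the relational view coincide for the binary case.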

But now that I think about it, somewhere else Date pooh-poohed binary relations (versus n-ary ones) as being too restrictive or unnatural or something (which, however, I find unconvincing).
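One way to see why the restriction is not fundamental: an n-ary relation can be encoded as a family of binary relations by giving each tuple an identifier. The sketch below assumes a made-up ternary relation. The attribute names, the data, and the functions `to_binary` and `from_binary` are hypothetical, and this is only one possible encoding, not something taken from Date.

```python
# Hypothetical ternary relation: (supplier, part, quantity). The data is made up.
shipments = {("s1", "p1", 300), ("s1", "p2", 200), ("s2", "p1", 100)}
attrs = ("supplier", "part", "qty")


def to_binary(relation, attributes):
    """Give each tuple an id and emit one binary relation (id, value) per attribute."""
    binaries = {name: set() for name in attributes}
    for tuple_id, row in enumerate(sorted(relation)):
        for name, value in zip(attributes, row):
            binaries[name].add((tuple_id, value))
    return binaries


def from_binary(binaries, attributes):
    """Rebuild the n-ary relation by joining the binary relations on the tuple id."""
    columns = [dict(binaries[name]) for name in attributes]
    return {tuple(col[i] for col in columns) for i in columns[0]}


encoded = to_binary(shipments, attrs)
decoded = from_binary(encoded, attrs)
```

The round trip loses nothing, but it does cost an extra identifier per tuple and a join per attribute, which may be the kind of unnaturalness Date had in mind.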

Graph angst

Another thing I get hung up on is that they dismiss graph theory as being hopelessly complex

They would probably soil themselves if they saw a category-theoretic data formalism. ;-)

But now that I think about it, somewhere else Date pooh-poohed binary relations (versus n-ary ones) as being too restrictive or unnatural or something (which, however, I find unconvincing).

I think they would have had a harder time convincing themselves that their formalism was "easier" if you had to build tuples and tuple relations up from scratch.
(Nothing that would deter a category theorist or someone used to Haskell functional types, of course. ;-) )

Trees / graphs

A Unix filesystem hierarchy with symlinks isn't even necessarily acyclic. Symlinks are a fairly good example of where the hierarchical model starts to run into problems - sooner or later, somebody's going to need them, and they introduce all kinds of complexity that a model based on simple trees won't have figured out how to manage. Similar thing happens with inheritance graphs in OO.
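A rough sketch of the symlink case, assuming a toy in-memory model rather than a real filesystem: the paths, the `children` and `symlinks` tables, and the function `has_cycle` are all made up for illustration. The directory tree on its own is acyclic; adding one symlink edge that points back up the tree is enough to create a cycle.

```python
# Hypothetical filesystem: the directory names and the symlink are made up.
children = {
    "/": ["/home", "/etc"],
    "/home": ["/home/alice"],
    "/home/alice": [],
    "/etc": [],
}

# A symlink adds an extra edge from a directory to an arbitrary target,
# here something like /home/alice/up -> /home.
symlinks = {"/home/alice": ["/home"]}


def has_cycle(root, children, symlinks):
    """Depth-first search that follows both child edges and symlink edges."""
    in_progress, done = set(), set()

    def visit(node):
        in_progress.add(node)
        for nxt in children.get(node, []) + symlinks.get(node, []):
            if nxt in in_progress:
                return True   # back edge: nxt is an ancestor on the current path
            if nxt not in done and visit(nxt):
                return True
        in_progress.discard(node)
        done.add(node)
        return False

    return visit(root)


tree_only = has_cycle("/", children, {})
with_links = has_cycle("/", children, symlinks)
```

A traversal written for plain trees needs no `in_progress` set at all, which is the extra bookkeeping that symlinks force on every tool that walks the hierarchy.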

Slashdot discussion

Poor even by Slashdot's standards, but here it is...