XML

Status of XQuery in the .NET Framework 2.0

The official Microsoft statement,

Microsoft has decided not to ship a client-side XQuery implementation in the final version of .NET 2.0 Framework ("Whidbey")....

However, we will be shipping a subset of XQuery in SQL Server ("Yukon") 2005. The reason for this is the new native XML datatype in SQL Server 2005. Microsoft recommends XQuery as the way to re-shape, query, and modify XML data within SQL Server 2005.

Reminds me of this discussion ;-)

XPath, XML, Python

A discussion with some interesting views and perspective.

Here you'll find code snippets showing the API of the various libraries (all of them used to solve the same programming task), and comments about the behaviour of each of the libraries.
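To give a flavour of what such a snippet looks like, here is a minimal sketch using only Python's standard library. The task (pulling titles out of entries in a given category) and the sample document are hypothetical, invented for illustration; they are not the task used in the linked discussion, and `xml.etree.ElementTree` is not necessarily one of the libraries compared there. Note that ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this sketch.
DOC = """
<feed>
  <entry category="xml"><title>First post</title></entry>
  <entry category="rdf"><title>Second post</title></entry>
  <entry category="xml"><title>Third post</title></entry>
</feed>
"""


def titles_in_category(xml_text, category):
    """Return the text of every <title> under an <entry> in the given category."""
    root = ET.fromstring(xml_text)
    # Attribute predicates are part of the XPath subset ElementTree understands.
    path = f".//entry[@category='{category}']/title"
    return [node.text for node in root.findall(path)]


xml_titles = titles_in_category(DOC, "xml")
print(xml_titles)
```

A library with full XPath support would let the same query be written with richer expressions; this sketch only shows the general shape of the API.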

I am not sure whether this item should be categorized as pro-libraries, pro-a-big-standard-library, or pro-built-in language support for XML.

You decide.

Introducing Comega

O'Reilly has an article, Introducing Comega, which covers some of the basic features of Cω: streams, "choice" and "nullable" types, anonymous structs and syntax support for XPath and relational query constructs.

I begin to see the point of language integration for XML processing, although the thought of using XML for general data storage and management still gives me the shivers...

Links (Wadler)

Wonder what Wadler is up to?

My latest research interest is a programming language for web application development, building on my experience with XML, Java, and Haskell.

A short introduction and a set of slides are available (both PDF).

Wonder about the language name?

A quarter of a century ago, Burstall and others at Edinburgh introduced an influential programming language, Hope, named after Hope Park Square, located near the University on the Meadows. This note proposes a research effort to design a new programming language for the web, Links, named after the Bruntsfield Links, located at the other end of the Meadows and site of the world’s first public golf course.

And here's why you should be interested in this work,

Other languages for web programming include Xtatic from Pierce, Scala from Odersky, and Xen and Cω from Microsoft. Links will benefit from fruitful interactions with these efforts. However, Links differs crucially in that it adopts database ideas from Kleisli and systems principles from Erlang, taking it well beyond the capabilities of these other languages.

XQuery and XSLT as declarative languages

More from Michael Rys.

Rys lists several advantages of declarative languages, some of which apply to functional languages in general.

It seems I've become the editor responsible for XML related links around here lately, even though we have on board editors who are much more qualified for this than I am ;-).

XQuery in Relational Database Systems

Michael Rys reports on his experience at XML 2004 and links to his paper and presentation,

The presentation will outline how XQuery and XML conceptually fit into a relational database environment. It will cover the organization of the XML in the database, how to type it using W3C XML Schema, how to query it both in conjunction with SQL and using top-level XQuery. It will present the concepts, talk about new developments in the ISO/ANSI SQL/XML standards and present some demos of XQuery in the upcoming Microsoft SQL Server 2005.

RDF Elevator Pitch

Eureka, the perfect RDF introduction with thanks to A.M. Kuchling (amk). Nothing beats crayon-colored diagrams. It is short, sweet, and hits the main points precisely, including 'political' issues at the end. Much W3C advocacy makes the Semantic Web sound too futuristic.... The RDF Core spec is hard to read and really boring.... Introductory tutorials are few.... Simple things can be done without much effort, and can still be useful.

On one island are the semantic web folks. On another island are the semantic filesystem folks. A summit seems in order. I don't hear much about the two working together, but then I live on yet another island. RDF+ReiserFS looks like a match made in heaven. For example, Reiser4 uses dancing trees, which obsolete the balanced tree algorithms used in databases... Do you want a million files in a directory, and want to create them fast? No problem.

From the article,

Reiser has "substantial plans" for adding new kinds of semantics to ReiserFS to help it challenge Microsoft's efforts. "We're planning on competing with the Longhorn filesystem," he says.

The new ReiserFS will eschew the relational algebra approach and work with semistructured data. "The person entering data can employ [the] structure inherent in the data rather than forcing a structure," Reiser said, adding, "Flexibility in querying and creating data is our target. [This] will stand in contrast to Microsoft's SQL-based approach, which does not have that flexibility."

RDF and Databases

Some RDF research led me to a nice paper (PDF) from IBM discussing RDF with relational databases. This combination can replace half-baked application data mechanisms. These crop up regularly in my consulting work. Think nested directories of Windows INI files and brittle, binary files breaking on minor design iterations. The pain, the pain.

Someone should describe RDF in 500 words or less as a generalization of INI. That note would spread understanding of RDF, which is simple but often described so abstractly that it seems complicated. It's better to start from the known and move to the unknown.

Here is a short attempt, just to spark interest. Experts may call me all wet. Windows INI format uses "key-equals-value," with keys grouped into sections. Think of "key-equals-value" as a special case of RDF's "subject-predicate-object." RDF generalizes to any verb, not just "equals," along with superior grouping. While INI nests just one level down (via sections), RDF URIs handle arbitrary nesting (via slashes), and URIs also permit remote data. That is not to say RDF data must be tree-structured. Most RDF papers focus too much on XML. XML is merely one expression syntax. There are several others and a relational database will store RDF data in its own way, completely independent of XML.
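The INI analogy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the base URI `http://example.org/config/`, the `key/` path segment, and the sample INI contents are all hypothetical, and plain tuples stand in for a real RDF store. Each section becomes a subject, each key a predicate, and each value an object.

```python
import configparser

# Hypothetical INI contents, made up for this sketch.
INI_TEXT = """
[display]
width = 800
height = 600

[network]
proxy = none
"""


def ini_to_triples(text, base="http://example.org/config/"):
    """Turn INI text into a list of (subject, predicate, object) tuples.

    The base URI is an assumption; any URI scheme would do.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    triples = []
    for section in parser.sections():
        subject = base + section              # section name -> subject URI
        for key, value in parser[section].items():
            predicate = base + "key/" + key   # key name -> predicate URI
            triples.append((subject, predicate, value))  # value -> literal object
    return triples


triples = ini_to_triples(INI_TEXT)
for s, p, o in triples:
    print(s, p, o)
```

Nothing here requires XML, and nothing limits the subjects to one level of nesting: a subject such as `http://example.org/config/display/monitor1` would work equally well.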

There are several projects in this domain. My favorite so far is OpenRDF Sesame. It supports querying at the semantic level. It seems more mature than others, having derived from previous efforts, and works with both PostgreSQL and MySQL as well as Oracle. An abstraction layer called SAIL makes Sesame database-agnostic. Sesame even sports a stand-alone b-tree system, or in-memory operation, if you don't want an external database. I like PostgreSQL much better than MySQL for its loose BSD license and technical merits. Apropos of that, another bit of news is that PostgreSQL now works natively on Windows. (The PostgreSQL client has always worked natively as a DLL.) PostgreSQL speed issues mentioned in Sesame papers have improved. As for Sesame, the only drawback is Java. But since Sesame interfaces over TCP through Java servlets, that's a don't-care.

On a related note, I looked into Python-based Chandler. The story there is that it's a custom job because, says Andi Vajda,

When I started working on this project in May, the repository was late, very late, and the project was stalled because of that. I felt I could get something usable for the project to resume much faster if I started a data model implementation from scratch and persisted it using Sleepycat's Berkeley dbxml and Berkeley db. Today, the Chandler repository is not really so much an object database as an item XML database combined with large collections of references directly stored in Berkeley DB.

Hmm...project behind, so build from scratch? I'm not clear why Chandler didn't go with RDF, but it sounds like project management problems. It seems as though RDF would support all that Chandler wants to do without the constrictions of XML. Note that Sesame has Python bindings.

The Xtatic experience

Vladimir Gapeyev, Michael Y. Levin, Benjamin C. Pierce, and Alan Schmitt. The Xtatic experience. Technical Report MS-CIS-04-24, University of Pennsylvania, October 2004.

The aim of the present paper is to discuss Xtatic - less formally and more holistically - from the perspective of language design. We survey the most significant issues we faced in the design process and evaluate the choices we have made in addressing them.

As you'd expect, a lot of time is spent explaining the design decisions concerning the type system.

More Xtatic papers here.

Haskell Communities and Activities Report, Seventh Edition, November 2004

The November 2004 edition of the biannual Haskell Communities and Activities Report has been published. Lots of new stuff in the last six months, and some old stuff updated as well. The HC&AR has been steadily growing over the last three years, showing that FP is gaining users both professional and private.

Several of the HC&AR items are interesting enough to have their own LtU stories, which may appear shortly.
