Lambda the Ultimate

Unifying Tables, Objects and Documents
started 4/29/2003; 3:02:58 AM - last post 5/11/2003; 8:13:12 AM
Dan Shappir - Unifying Tables, Objects and Documents
4/29/2003; 3:02:58 AM (reads: 2487, responses: 17)
Unifying Tables, Objects and Documents
This paper proposes a number of type-system and language extensions to natively support relational and hierarchical data within a statically typed object-oriented setting. In our approach SQL tables and XML documents become first class citizens that benefit from the full range of features available in a modern programming language like C# or Java. This allows objects, tables and documents to be constructed, loaded, passed, transformed, updated, and queried in a unified and type-safe manner.

I've yet to read this article but it looks interesting.
Posted to xml by Dan Shappir on 4/29/03; 3:03:22 AM

Martin Bravenboer - Re: Unifying Tables, Objects and Documents
4/29/2003; 4:04:10 AM (reads: 1831, responses: 0)
There is already a topic on this paper. Weird that nobody posted a response there, btw...

Cimarron Taylor - Re: Unifying Tables, Objects and Documents
4/29/2003; 8:17:12 AM (reads: 1774, responses: 0)
Unfortunately in the Pokemon table on page 5, Zubat is listed as a "Plant" Pokemon. This is incorrect. It should be "Poison/Flying".

Brent Fulgham - Re: Unifying Tables, Objects and Documents
4/29/2003; 8:37:33 AM (reads: 1753, responses: 0)
While the paper is interesting, the authors make an annoying and uncompelling "argument from experience" that embedding a DSL to handle interoperation with SQL or XML "does not scale well." (On the positive side they do reference our Very Own Noel Welsh's SchemeQL/SchemeUnit paper.)

What does this mean? I would find the paper far more compelling were they to include some example or datum to support this assertion. In my experience, peanut butter is a pretty wretched foodstuff. That doesn't make it so!

Granted, the topic of the paper is the authors' work to extend the C# language with constructs that natively support XML and SQL manipulation. But isn't that really just creating a DSL? I can envision a set of macros and data types added to CL or Dylan that would largely support the work described in the paper. Have I "grown the language", or did I create a DSL?

Isaac Gouy - Re: Unifying Tables, Objects and Documents
4/29/2003; 9:02:46 AM (reads: 1764, responses: 0)
Weird that nobody posted a response...
Indeed. Maybe if it had been titled C# 2.0 - Lists, Tuples, Anonymous Functions, Folds,... there would have been more reaction? ;-)

Should we be thinking again about Todd Proebsting's "disruptive technologies"?

Neel Krishnaswami - Re: Unifying Tables, Objects and Documents
4/30/2003; 6:24:29 AM (reads: 1584, responses: 0)
I think that they are actually on to something when they say embedded DSLs don't scale well. Most macro systems let you add new syntax, but they don't let you extend the language at the lexical level. (Common Lisp, as always, is an exception to this rule.) This is extremely useful for being able to invent very terse, expressive sublanguages -- imagine trying to write a macro for printf-style format strings using either Scheme or Dylan's macro languages. Moreover, you could use this to directly embed XML or SQL expressions in your program (which would expand to a set of library calls, of course).

I think that if you combine lexical macros with static types you have the potential to create some /amazing/ DSLs. For example, using phantom types you can ensure that your HTML template language can *only* produce valid HTML, or that your SQL embedding can guarantee that the values you get out are statically typesafe. All the pieces we need to make amazing new programming languages already exist -- we just need to put them together.
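The phantom-type trick Neel alludes to can be sketched quickly. The thread is about ML/Haskell-style types, but the same idea transposes to Rust; every name below (`Expr`, `int_lit`, `eq`, `and`, the SQL rendering) is invented for illustration, not taken from the paper:

```rust
use std::marker::PhantomData;

// The type parameter T records what the expression evaluates to, but
// appears in no field: a phantom type. Ill-typed queries are rejected
// by the compiler before any SQL string is ever built.
struct Expr<T> {
    sql: String,
    _marker: PhantomData<T>,
}

fn int_lit(n: i64) -> Expr<i64> {
    Expr { sql: n.to_string(), _marker: PhantomData }
}

fn bool_lit(b: bool) -> Expr<bool> {
    Expr { sql: if b { "TRUE".into() } else { "FALSE".into() }, _marker: PhantomData }
}

// Only integer expressions can be compared; the result is boolean.
// `eq(int_lit(1), bool_lit(true))` is a compile-time error.
fn eq(a: Expr<i64>, b: Expr<i64>) -> Expr<bool> {
    Expr { sql: format!("{} = {}", a.sql, b.sql), _marker: PhantomData }
}

fn and(a: Expr<bool>, b: Expr<bool>) -> Expr<bool> {
    Expr { sql: format!("({} AND {})", a.sql, b.sql), _marker: PhantomData }
}

fn main() {
    let q = and(eq(int_lit(1), int_lit(2)), bool_lit(true));
    println!("{}", q.sql); // prints "(1 = 2 AND TRUE)"
}
```

The point of the sketch is that the well-typedness guarantee costs nothing at runtime: the phantom parameter is erased, and only the string remains.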

Frank Atanassow - Re: Unifying Tables, Objects and Documents
4/30/2003; 8:55:08 AM (reads: 1556, responses: 0)
using phantom types you can ensure that your HTML template language can *only* produce valid HTML

Phantom types are not necessary, and AFAIK give no advantage here. I have a paper in the works where we translate (a sizable subset of) XML Schema into Haskell types and there are no phantom types, except in one place where they make sense, namely to distinguish element types with different names but identical structure.

Phantom types are, IMO, often used in situations where non-phantom types can serve even better.
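The one use Frank does grant phantom types -- distinguishing element types with different names but identical structure -- can be sketched as follows. All names here (`El`, `TitleTag`, `title`, `author`) are invented for illustration, not taken from Frank's paper:

```rust
use std::marker::PhantomData;

// Two zero-sized tag types that exist only at the type level.
struct TitleTag;
struct AuthorTag;

// Both element types hold a plain string -- identical structure --
// but the phantom Name parameter keeps them apart statically.
struct El<Name> {
    text: String,
    _name: PhantomData<Name>,
}

fn title(s: &str) -> El<TitleTag> {
    El { text: s.to_string(), _name: PhantomData }
}

fn author(s: &str) -> El<AuthorTag> {
    El { text: s.to_string(), _name: PhantomData }
}

fn render_title(t: &El<TitleTag>) -> String {
    // Passing an El<AuthorTag> here is a compile-time error.
    format!("<title>{}</title>", t.text)
}

fn main() {
    let t = title("Unifying Tables, Objects and Documents");
    let a = author("Erik Meijer");
    println!("{}", render_title(&t));
    println!("<author>{}</author>", a.text);
}
```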

Neel Krishnaswami - Re: Unifying Tables, Objects and Documents
4/30/2003; 6:31:18 PM (reads: 1505, responses: 0)
How do you do that? When I try to use an ordinary set of ML algebraic datatypes to describe HTML, I find that the tagging and untagging overhead gets really cumbersome. Do you perhaps use recursive types?
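For concreteness, the tagging overhead Neel describes looks something like this when HTML is modelled with plain algebraic datatypes (Rust enums standing in for ML datatypes; all names are invented for illustration):

```rust
// Every node of the document carries an explicit constructor tag,
// and every traversal must pattern-match it off again.
enum Inline {
    Text(String),
    Em(Vec<Inline>),
}

enum Block {
    Para(Vec<Inline>),
}

fn render_inline(i: &Inline) -> String {
    match i {
        Inline::Text(s) => s.clone(),
        Inline::Em(xs) => {
            format!("<em>{}</em>", xs.iter().map(render_inline).collect::<String>())
        }
    }
}

fn render_block(b: &Block) -> String {
    match b {
        Block::Para(xs) => {
            format!("<p>{}</p>", xs.iter().map(render_inline).collect::<String>())
        }
    }
}

fn main() {
    // "<p>hello <em>world</em></p>" takes one constructor per node:
    let doc = Block::Para(vec![
        Inline::Text("hello ".into()),
        Inline::Em(vec![Inline::Text("world".into())]),
    ]);
    println!("{}", render_block(&doc)); // prints "<p>hello <em>world</em></p>"
}
```

Scaled up to the full HTML grammar, with dozens of element types and content models, the constructor noise at every node is the cumbersomeness being discussed.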

Ziv Caspi - Re: Unifying Tables, Objects and Documents
4/30/2003; 11:50:49 PM (reads: 1470, responses: 0)

This is extremely useful for being able to invent very terse, expressive sublanguages -- imagine trying to write a macro for printf-style format strings using either Scheme or Dylan's macro languages.

Terse, expressive sublanguages might result in scalability issues of their own, especially when you have a large development group. In large groups it is sometimes essential that all (or most) of Dev and Test be able to identify/fix problems when they occur -- having only a few people fluent in your embedded language is a drawback in such cases.

Neel Krishnaswami - Re: Unifying Tables, Objects and Documents
5/1/2003; 5:05:18 AM (reads: 1466, responses: 0)
Ziv, the basic layout and structuring model in most GUI toolkits and in HTML is basically the same: build a tree of widgets, and let the layout engine display them on the screen. And yet -- literally millions more people can code up an HTML page than can build a GUI form. I think that difference comes straight from the fact that nesting relationships in HTML are visually obvious, and not at all so in most GUI libraries. A well-designed sublanguage can mean that a whole bunch of your test issues go away, because the developers get it right in the first place. Also, sublanguages mean that bugs get localized: you don't need to look at thousands of code expansions of a design pattern; you can just look at the macro definition. This is the same reason that functions make testing easier, despite the fact that your programmers are introducing operators that the test group has never seen before.

A language is an interface, just like every other API, and you can and should manage it the same way.

Frank Atanassow - Re: Unifying Tables, Objects and Documents
5/1/2003; 6:11:51 AM (reads: 1446, responses: 0)
How do you do that? When I try to use an ordinary set of ML algebraic datatypes to describe HTML, I find that the tagging and untagging overhead gets really cumbersome.

It is really cumbersome, in Haskell. This is one of the situations where a language with iso-inference, as I described in another forum, would be a big win.

But the translation is mostly intended for use with Generic Haskell, to write typesafe `Schema-aware' applications, i.e., applications which work for any Schema, but still take advantage of the Schema. (For example, an XML compressor can take advantage of the Schema to improve performance.) In such generic programs, the overhead of isos is almost negligible, since you only treat each case once.

Do you perhaps use recursive types?

You mean equirecursive types? No, they are an abomination. :)

Neel Krishnaswami - Re: Unifying Tables, Objects and Documents
5/1/2003; 2:59:58 PM (reads: 1431, responses: 0)
If by equirecursive types you mean what you get when you turn off the occurs check in the unifier, then yes. I can never keep equi- and iso- recursive types straight.
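The distinction Neel and Frank are circling can be made concrete. A language with ordinary recursive datatypes is iso-recursive: the type and its one-step unrolling are *distinct* types, mediated by explicit roll (a constructor application) and unroll (a pattern match) steps. An equi-recursive system -- "the occurs check turned off" -- instead treats them as literally *equal*. A sketch in Rust terms (names invented for illustration):

```rust
// An iso-recursive list: List and its unrolling i64 * List are
// different types. Box::new is the explicit "roll"; pattern matching
// plus deref is the explicit "unroll". An equi-recursive type system
// would identify the two silently, with no witnesses in the program.
enum List {
    Nil,
    Cons(i64, Box<List>),
}

fn to_vec(xs: &List) -> Vec<i64> {
    match xs {
        List::Nil => Vec::new(),
        List::Cons(x, rest) => {
            let mut v = vec![*x];
            v.extend(to_vec(rest)); // unrolling one layer, explicitly
            v
        }
    }
}

fn main() {
    let xs = List::Cons(1, Box::new(List::Cons(2, Box::new(List::Nil))));
    println!("{:?}", to_vec(&xs)); // prints "[1, 2]"
}
```

The mnemonic: *iso* because the type and its unrolling are merely isomorphic, *equi* because they are equal.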

Ziv Caspi - Re: Unifying Tables, Objects and Documents
5/2/2003; 1:04:59 AM (reads: 1384, responses: 0)

Neel,

I totally agree that no API is ever going to beat a language (or sub-language) designed to carry out the same task. Consider regular expressions in Perl vs. (say) in .NET. The latter's API is very reasonable, well-designed, and clean. It has sufficient "power" to carry out all the tasks I've ever needed. And yet, it doesn't come close to the accessibility of Perl's integrated support.

That said, every such language carries a price tag. First you have to notice that you're dealing with an embedded language, which might not do what you thought it would. Then you have to learn its syntax. This overhead is completely justified if you're dealing with the language on a daily basis; it might not be otherwise.

Frank Atanassow - Re: Unifying Tables, Objects and Documents
5/2/2003; 3:49:33 AM (reads: 1392, responses: 0)
If by equirecursive types you mean what you get when you turn off the occurs check in the unifier, then yes. I can never keep equi- and iso- recursive types straight.

Yes, that's what I meant.

Neel Krishnaswami - Re: Unifying Tables, Objects and Documents
5/2/2003; 8:46:03 AM (reads: 1379, responses: 0)
Ziv: I think there are a lot of developers who spend every day working with XML, SQL, and GUIs. To me, it just makes sense to try and make their lives as much easier as we can, and the integration as close, clean, and robust as we can.

The extensions to the conventional type systems that this paper proposes are fairly modest, which is why I think it's cool. Small ideas that yield big wins are the best and most elegant kind.

But that also means that their other idea -- making the in-language representations match the documents and queries your program actually uses -- is the most fruitful direction for further investigation. There's surprisingly little work (for example) on building integrated scanner/parser systems, which you would need to mix sublanguages with different lexical conventions. (This is especially important if you want a template language that doesn't suck.)

Martin Bravenboer - Re: Unifying Tables, Objects and Documents
5/2/2003; 11:50:41 AM (reads: 1378, responses: 1)
There's surprisingly little work (for example) on building integrated scanner/parser systems, which you would need to mix sublanguages with different lexical conventions.

You might be very interested in Scannerless Generalized LR parsing. Stratego uses SGLR to support Meta Programming with Concrete Object Syntax.

Actually Erik Meijer (one of the authors of this paper) has a tip of the day on his homepage: Why did I ever mess around with parser combinators? Scannerless GLR (or Earley) is the way of the future!

Matt Hellige - Re: Unifying Tables, Objects and Documents
5/2/2003; 12:51:08 PM (reads: 1451, responses: 0)
Actually Erik Meijer (the author of this paper) has a tip of the day on his homepage: Why did I ever mess around with parser combinators? Scannerless GLR (or Earley) is the way of the future!

For more on this and related issues in extensible language design, you might be interested in Donovan Kolbly's PhD dissertation. I'm not entirely sure I actually like the idea of languages being quite that syntactically (and lexically) extensible, but it's certainly neat, and it definitely got me interested in Earley parsing...

Ehud Lamm - Re: Unifying Tables, Objects and Documents
5/11/2003; 8:13:12 AM (reads: 1195, responses: 0)
It seems the paper was not accepted for OOPSLA 2003.