Lambda the Ultimate

Extensible Programming for the 21st Century
started 5/12/2004; 6:33:00 AM - last post 5/20/2004; 1:21:39 PM
Ehud Lamm - Extensible Programming for the 21st Century
5/12/2004; 6:33:00 AM (reads: 6710, responses: 9)
Extensible Programming for the 21st Century
The issues Gregory Wilson discusses in this article come up on LtU fairly often. Do we want extensible syntax? Should XML be used? Should we continue to store programs as text files? Etc.

The discussion seems balanced and reasonable, but I am sure some people here (you know who you are...) are going to find the conclusions objectionable. That's why we have a discussion group...


Posted to general by Ehud Lamm on 5/12/04; 6:33:41 AM

Dominic Fox - Re: Extensible Programming for the 21st Century
5/12/2004; 6:52:48 AM (reads: 614, responses: 0)

Besides XSLT, which I don't think is especially fit for that purpose anyway, what part of the standard XML toolset might one use to process an XML representation of a program's abstract syntax, and to what advantage?

I'm not sure it's that much easier to parse an XML document than to parse a text file in some other format for which a grammar exists. Parsing isn't finished when you've obtained some representation of the infoset for an XML document, any more than it is finished when the lexer has broken a "flat" text file up into tokens.
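
To make that concrete, here is a minimal sketch in Python, using an XML vocabulary for expressions invented just for this example (no standard one is assumed). The XML toolset hands you the element tree; deciding what the elements mean is a second pass that is still entirely your problem:

    import xml.etree.ElementTree as ET

    # An invented encoding of the expression 2 + (3 * 4).
    source = """
    <apply op="add">
      <int value="2"/>
      <apply op="mul">
        <int value="3"/>
        <int value="4"/>
      </apply>
    </apply>
    """

    tree = ET.fromstring(source)   # this much the XML toolset gives you

    def interpret(node):
        # ...but what the elements *mean* is still up to you
        if node.tag == "int":
            return int(node.attrib["value"])
        if node.tag == "apply":
            args = [interpret(child) for child in node]
            return sum(args) if node.attrib["op"] == "add" else args[0] * args[1]
        raise ValueError("unknown element: " + node.tag)

    print(interpret(tree))   # 14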

Chris Rathman - Re: Extensible Programming for the 21st Century
5/12/2004; 8:38:11 AM (reads: 563, responses: 1)
Dominic: I'm not sure it's that much easier to parse an XML document than to parse a text file in some other format for which a grammar exists.
Prior to XML, it was nigh impossible to parse code. XML makes these things not only easy but also enjoyable. :-)

(wonders how anything ever got accomplished b/4 xml arrived?)

Dominic Fox - Re: Extensible Programming for the 21st Century
5/12/2004; 8:51:46 AM (reads: 548, responses: 0)

As it happens, I'm writing a C# library for manipulating RDF at this very moment. I can already construct RDF graphs programmatically, and serialize them to N-triples or RDF/XML, so the main thing left to do is the RDF/XML parser. Given that .Net has an "XML Parser" built-in, you'd think I wouldn't have any work left to do...wouldn't you...?
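
The gap, roughly, is that a generic XML parser gives you elements and attributes, while RDF/XML's striped node/property syntax needs its own layer on top before you get triples. A rough sketch in Python rather than C#, handling only the simplest striping case (real RDF/XML has many more production rules than this):

    import xml.etree.ElementTree as ET

    RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

    doc = """
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="http://example.org/article">
        <dc:title>Extensible Programming for the 21st Century</dc:title>
        <dc:creator>Gregory Wilson</dc:creator>
      </rdf:Description>
    </rdf:RDF>
    """

    triples = []
    for node in ET.fromstring(doc):        # node elements (subjects)
        subject = node.attrib[RDF + "about"]
        for prop in node:                  # property elements (predicates)
            predicate = prop.tag           # comes back as "{namespace}local"
            triples.append((subject, predicate, prop.text))

    for t in triples:
        print(t)

The generic parser did its job; everything after ET.fromstring is the parser you still have to write.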

Ehud Lamm - Re: Extensible Programming for the 21st Century
5/12/2004; 11:06:08 AM (reads: 494, responses: 0)
To date, no viable approach has surfaced for mitigating the problems associated with multiple language dialects. Isn't this the core issue?

Dominic Fox - Re: Extensible Programming for the 21st Century
5/12/2004; 12:49:57 PM (reads: 466, responses: 0)

One of the more promising ideas in the article is that an XML-based representation of source code would facilitate embedding of other kinds of content: SVG class diagrams, MathML illustrations of formulae, "literate" commentary and so forth. This is already possible in a large number of ways using a large number of tools, of course; it's a bit rich to say that XML holds the promise of finally realizing Don Knuth's "dream" of literate programming when CWeb (and more recently Leo) have been realizing it for years. But it's true that by mixing other dialects in with an AST-level view of code one could provide more granular program meta-information than you get with JavaDoc, say, or C# attributes. The question is, when is it needed and what is it needed for?
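
For illustration, here is the sort of thing being described, with a vocabulary invented for this sketch rather than taken from the article: a function definition carrying literate commentary and an embedded MathML formula alongside its body, plus a small Python tool that pulls out per-declaration meta-information:

    import xml.etree.ElementTree as ET

    MATHML = "{http://www.w3.org/1998/Math/MathML}"

    unit = """
    <unit>
      <function name="stddev">
        <doc>Sample standard deviation; see the embedded formula.</doc>
        <math xmlns="http://www.w3.org/1998/Math/MathML">
          <mi>s</mi><mo>=</mo><msqrt><mi>variance</mi></msqrt>
        </math>
        <body>return sqrt(variance(xs))</body>
      </function>
    </unit>
    """

    for fn in ET.fromstring(unit).findall("function"):
        doc = fn.findtext("doc")
        has_formula = fn.find(MATHML + "math") is not None
        print(fn.attrib["name"], "-", doc, "formula attached:", has_formula)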

Mark Evans - Re: Extensible Programming for the 21st Century
5/12/2004; 12:57:23 PM (reads: 461, responses: 0)

The best way to bridge this gap---indeed, the only way---is to let programmers control how linkers, debuggers, and other tools handle generated code.

Plug-in APIs are nice, but the story needs an intermediate compiler language to really come together, especially on this debugging problem, which the author considers primary. An IL refactors the tool chain. It breaks down monolithic compilers and offers a natural insertion point for plug-ins (if not the only one). An IL provides natural scaffolding for the debugger and linker. Programmers could even hand-edit the IL now and then.
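
A toy illustration of that shape (nothing here models any real compiler's IL): once the pipeline is front end -> IL -> back end, a plug-in is just a function from IL to IL, and because each instruction carries its source position, the debugger and linker have something concrete to hang on to. Sketched in Python:

    # Each IL instruction is (opcode, operands, source_line).
    def front_end(ignored_source):
        # stand-in for a parser; emits IL for "x = 2 * 3; print(x)"
        return [
            ("const", (2,), 1),
            ("const", (3,), 1),
            ("mul",   (),   1),
            ("store", ("x",), 1),
            ("load",  ("x",), 2),
            ("print", (),   2),
        ]

    def fold_constants(il):
        """A plug-in pass: rewrite const/const/mul into a single const."""
        out = []
        for instr in il:
            if (instr[0] == "mul" and len(out) >= 2
                    and out[-1][0] == out[-2][0] == "const"):
                a, b = out.pop(), out.pop()
                out.append(("const", (a[1][0] * b[1][0],), instr[2]))
            else:
                out.append(instr)
        return out

    passes = [fold_constants]              # plug-ins register here
    il = front_end("x = 2 * 3\nprint(x)")
    for p in passes:
        il = p(il)
    for op, args, line in il:              # hand-inspectable, hand-editable
        print(f"line {line}: {op} {args}")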

In the case of numerical languages like Mathematica and MATLAB, using XML for storage would allow programmers to put real mathematical notation directly into their source files.

His prior remarks had me thinking about Mathematica as a case in point even before this direct mention. It is a full programming language, not just a math toolkit. Actually, Mathematica does XML extremely well, and the company even drove development of MathML. It already puts "real mathematical notation" and cool vector graphics directly in source files. The XML argument is not what got me thinking, though.

Mathematica bears interesting resemblances to Lisp and Scheme. Code is data; everything is an S-expression. Mathematica's S-expressions are somewhat more rational, even if the overall syntax has little else to commend it. I don't get hung up on syntax, though. That's why I seldom use Lisp or Scheme, in fact: I have Mathematica. Mathematica has a much stronger model/view paradigm. In fact, it's strong enough to factor out two application programs, the "kernel" (model) and the "front end" (view). There are even commands to change the view, e.g. "StandardForm," "TraditionalForm," "FullForm." I very much like the model/view notion for source code.
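
A rough analogy in Python (not a claim about how Mathematica is actually implemented): the "model" is just nested head/argument data, and "views" such as FullForm or a traditional infix rendering are separate functions over that data.

    expr = ("Plus", 1, ("Times", 2, "x"))      # Plus[1, Times[2, x]]

    def full_form(e):
        if not isinstance(e, tuple):
            return str(e)
        head, *args = e
        return head + "[" + ", ".join(full_form(a) for a in args) + "]"

    INFIX = {"Plus": " + ", "Times": "*"}

    def traditional_form(e):
        if not isinstance(e, tuple):
            return str(e)
        head, *args = e
        return "(" + INFIX[head].join(traditional_form(a) for a in args) + ")"

    print(full_form(expr))          # Plus[1, Times[2, x]]
    print(traditional_form(expr))   # (1 + (2*x))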

Mathematica is a beautiful but proprietary system. The interesting aspect is how compiler control folds into the system. A "Compile" function turns S-expressions into faster IL (which is its own S-expression, and editable). However, the style of well-written Mathematica code is highly functional (not imperative), so it relies on inference capabilities to select appropriate numerical algorithms and data layouts. Hence the Compile function is rarely used. Some implementation details are allowed to peek through for fine-tuning, such as the packed array and sparse array directives and associated conversions.

Its "pluggability" comes from MathLink, which is just S-expressions over a wire. The kernel may be treated as a "component" of other programs, or may use other components (i.e. C/C++ functions). It is possible to write your own front end. Kernels can share computational burdens. There is a language subset for GUI development in the front end. The J/Link software that ships with Mathematica is a Java implementation of these concepts. I have used the C equivalents many times and find the development cycle highly productive.
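
A toy rendering of the "expressions over a wire" idea, in Python; the real MathLink protocol is a binary format with its own C API, and nothing below models it. The point is only the shape of the interaction: a front end ships an expression to a kernel and reads an expression back.

    import ast
    import socket

    def evaluate(e):
        # the "kernel": expressions are (head, arg, arg, ...) tuples;
        # only Plus and a two-argument Times are handled in this sketch
        if not isinstance(e, tuple):
            return e
        head, *args = e
        args = [evaluate(a) for a in args]
        return sum(args) if head == "Plus" else args[0] * args[1]

    frontend, kernel = socket.socketpair()

    # front end side: serialize the expression and send it
    frontend.sendall(repr(("Plus", 1, ("Times", 2, 3))).encode() + b"\n")

    # kernel side: read, rebuild the expression, evaluate, reply in kind
    request = ast.literal_eval(kernel.recv(1024).decode())
    kernel.sendall(repr(evaluate(request)).encode() + b"\n")

    print(frontend.recv(1024).decode().strip())   # 7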

Mathematica does a good job of meeting the author's criteria, at least for a proprietary offering that can only open itself to the extent that valid business considerations allow. Where Mathematica breaks down is debugging, which is often hard and involves tedious manual parsing of output. Still, it's one of my all-time favorite tools.

My main critique of this paper is that it falls a bit hard for the fallacy that XML syntax solves everything. As the author notes,

Language extensibility has been around for years, but is still largely an academic curiosity. Three things stand in the way of its adoption: programmers' ignorance, the absence of support in mainstream languages, and the cognitive gap between what programmers write, and what they have to debug.

The basic argument is that XML has finally overcome the inertia of the unwashed masses, who are focused on syntax. So even though Lisp could do this stuff 20 years ago, now is the time to seize the bull by the horns. Well, OK: we have a new storage syntax standard. We still need semantic models by which the dev tools can interoperate. Encoding them in XML is a side issue. Is the story that we had to wait for the unwashed masses to agree on trivia like syntax before we could develop semantic models?

In the user interaction plane, Eclipse is the best hope for an extensible dev-tools framework. The article gives Eclipse only a passing mention and falsely implies that it mandates a storage format. In fact, Eclipse is completely open.

The IL issue remains outstanding. An IL should incorporate type annotations anyway, and in so doing enable clean debugging. I'm not sure .NET is an answer. That involves JITting and Microsoft anyway, so I would vote for some open-source, cross-platform alternative.
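
A small sketch of why the annotations matter, with an IL invented for the purpose (not .NET's or anyone else's): because every instruction records a type and the source position it was generated from, a verifier or debugger can report a problem in the programmer's terms instead of in terms of generated code.

    from dataclasses import dataclass

    @dataclass
    class Instr:
        op: str          # "push" or a binary operator
        type: str        # annotated operand/result type
        src_line: int    # line in the original source, not the generated code

    program = [
        Instr("push", "int", 1),
        Instr("push", "str", 2),   # e.g. produced by an embedded-DSL expansion
        Instr("add",  "int", 2),
    ]

    def verify(il):
        stack = []
        for instr in il:
            if instr.op == "push":
                stack.append(instr.type)
            else:
                b, a = stack.pop(), stack.pop()
                if not (a == b == instr.type):
                    print(f"source line {instr.src_line}: {instr.op} expects "
                          f"{instr.type}, got {a} and {b}")
                stack.append(instr.type)

    verify(program)   # -> source line 2: add expects int, got int and str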

Mark Evans - Re: Extensible Programming for the 21st Century
5/12/2004; 3:38:56 PM (reads: 415, responses: 0)

And one capable of multi-processing, with single-processing as a special case, not the primary design point.

To be explicit about JITting - it will not help the debugging problem.

Ziv Caspi - Re: Extensible Programming for the 21st Century
5/14/2004; 10:50:16 AM (reads: 159, responses: 0)

Chris: Prior to XML, it was nigh impossible to parse code

Prior to XML, it was nigh impossible to write code parsers. After XML, it is nigh impossible to read code...

(This is my impression based on XSLT: I really liked it when it came out, and used it whenever I could. The more I used it, however, the less I liked it, until I dropped it almost completely.)

Mark Evans - Re: Extensible Programming for the 21st Century
5/20/2004; 1:21:39 PM (reads: 75, responses: 0)
I'm even less sure that .NET is an answer.