Interesting blog post by Crista Lopes. Here is some text from the bottom that struck a chord with me:
We've talked a little about programming language design research before.
And let me also repeat CBPV's slogan, which is one of the finest in PL advocacy: Once the fine structure has been exposed, why ignore it?
I am curious why it seems there is little or no research on adapting grammars and parsers to support parametric polymorphism. It seems to me that grammars and types are more or less the same thing, and technology related to polymorphic type systems should apply to grammars and parsing as well.
Existing grammar technology has woeful compositional properties compared to type systems. I am currently using Dypgen which allows dynamic extension of a grammar. But first let me backtrack a bit:
Suppose to have an executable recursive descent parser for statements, where the parser accepts a list of statement forms and tries each one until it succeeds. If you put that list in a global variable, it is easy to extend the system by constructing a suitable data structure for parsing a statement at run time, push it onto the statement list and store the resulting list in the global variable.
The use of a global variable here rather than a weak functional technique is mandatory when you consider that some statements may be composed from others, and we want the recursion to extend nested statements to include the new production too.
Now as to Dypgen, it is better because it is purely functional in that after adding a new production for a statement, it rebuilds the parser engine, and so the recursion required to support nested statements works.
BUT .. we are still adding a new production to a statement, which is similar to hacking an Ocaml variant type and adding a new case, then recompiling. It's not the recompilation that concerns me here, but the fact we're forced to modify the old grammar to extend it.
The thing is the *right way(tm)* to do this would seem to be to use open recursion: in Ocaml you can use polymorphic variants with a parameter which is closed to form a concrete type, and for an extension you can add new variants to the open form and then close that. With this technology we have real subtyping: we have a type which is open for modification, and can be trivially closed for use, thus satisfying the open/closed principle.
Why can't we do this for grammars?
There are some real trivial uses for this. Dypgen supports 3 polymorphic operators already, namely * + and ?. But now, suppose I want to define "comma separated list of arbitrary-nonterminal" which in fact I need a lot.
I'm being asked to use a technology so seriously archaic it is worse than Basic or Cobol: it doesn't even have "subroutines". What I need here is actually quite flat: it needs parametric grammars, though not open recursion.
Given the huge amount of research into type systems .. why am I still using Assembler to write my grammars?
Active forum topics
New forum topics