Towards efficient, typed LR parsers

Towards efficient, typed LR parsers, François Pottier and Yann Régis-Gianas. In ACM Workshop on ML, March 2006.

The LR parser generators that are bundled with many functional programming language implementations produce code that is untyped, needlessly inefficient, or both. We show that, using generalized algebraic data types, it is possible to produce parsers that are well-typed (so they cannot unexpectedly crash or fail) and nevertheless efficient. This is a pleasing result as well as an illustration of the new expressiveness offered by generalized algebraic data types.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Derivation of a Typed Functional LR Parser

This makes me think of this (PDF).

Abstract. This paper describes a purely functional implementation of LR parsing. We formally derive our parsers in a series of steps starting from the inverse of printing. In contrast to traditional implementations of LR parsing, the resulting parsers are fully typed, stackless and table-free. The parsing functions pursue alternatives in parallel with each alternative represented by a continuation argument. The direct implementation presents many opportunities for optimization and initial measurements show excellent performance in comparison with conventional table-driven parsers.

That paper just blew my mind

That paper just blew my mind. I think it's a very good demonstration of how program derivation can yield new unexpected and very efficient programs. The results from this paper have been used to implement the parser generator Frown. It generates Haskell parsers. The resulting parsers are apparently faster than those generated by Happy, despite all the tricks Happy uses to speed them up.

Cool!

I hadn't seen that one before -- thanks!

which major ML implementations support GADTs?

I am all for GADTs :-) Does anybody know of implementations of GADTs in ML, or plans to support them? Haskell (ghc) has them and Scala, too, but I would prefer smlnj or polyml or ocaml to have them, rather than downloading some special purpose tarball. I should add that I am not tracking ML compilers as a hobby, so I might overlook solutions long-known to afficionados. Any pointers appreciated.