archives

Parsing with error recovery?

Are there parser DSLs which allow you to 'nicely' specify error handling? It seems natural with, say, parser combinators, though something that is otherwise more akin to LL(1) would be more interesting.

The context is that some of the most common languages are generally thought of as LL(1), CFGs, etc., yet, in reality, their informal specifications allow some forms of erroneous input and actually define how to handle them. This means they're not really erroneous, but pockets of complication that should be somewhat uniformly handled in an isolated manner.

Specifying semantics and type rules

There are relatively few language specifications out there which incorporate a formal statement of semantics and typing rules. Of those that do, the one that I'm most familiar with (ML) uses a fairly hard to read notation.

We're at the point where we want to incorporate such a statement into the BitC specification, and I'ld appreciate pointers, inputs, and suggestions on how we might express these things with the greatest clarity and precision. It seems to me, naively, that we need to specify how the full language syntax maps (syntactically) to a canonicalized core syntax, and then specify the semantics and typing rules for that core. At the moment, I'm attracted to the notation used by Pierce's book (and of course by others). I'm willing to sacrifice textual compactness for clarity. On the other hand, the idea of formalizing directly in Coq has its appeal.

Are there examples of specifications out there that we should look at as potential exemplars? If so, why are they exemplars? What makes them good. Conversely, are there examples out there of approaches to avoid, and why?

Aside: this isn't the right forum for an extended discussion of formalizing BitC in particular, but if you would like to participate in that process, please do come join the appropriate mailing list.