Parsing Expression Grammars: A Recognition-Based Syntactic Foundation by Bryan Ford, MIT, 2004.

For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ambiguity is crucial to their original purpose of modelling natural languages, but this very power makes it unnecessarily difficult both to express and to parse machine-oriented languages using CFGs. Parsing Expression Grammars (PEGs) provide an alternative, recognition-based formal foundation for describing machine-oriented syntax, which solves the ambiguity problem by not introducing ambiguity in the first place. Where CFGs express nondeterministic choice between alternatives, PEGs instead use *prioritized choice*. PEGs address frequently felt expressiveness limitations of CFGs and REs, simplifying syntax definitions and making it unnecessary to separate their lexical and hierarchical components. A linear-time parser can be built for any PEG, avoiding both the complexity and fickleness of LR parsers and the inefficiency of generalized CFG parsing. While PEGs provide a rich set of operators for constructing grammars, they are reducible to two minimal recognition schemas developed around 1970, TS/TDPL and gTS/GTDPL, which are here proven equivalent in effective recognition power.

An excellent paper! I read it for the first time today and was surprised not to find it in the LtU archive.

## Recent comments

4 days 5 hours ago

4 days 7 hours ago

5 days 6 hours ago

5 days 18 hours ago

6 days 5 min ago

6 days 10 min ago

6 days 11 min ago

6 days 3 hours ago

6 days 3 hours ago

6 days 3 hours ago