archives

Parsing Expression Grammars

Parsing Expression Grammars: A Recognition-Based Syntactic Foundation by Bryan Ford, MIT, 2004.

For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ambiguity is crucial to their original purpose of modelling natural languages, but this very power makes it unnecessarily difficult both to express and to parse machine-oriented languages using CFGs. Parsing Expression Grammars (PEGs) provide an alternative, recognition-based formal foundation for describing machine-oriented syntax, which solves the ambiguity problem by not introducing ambiguity in the first place. Where CFGs express nondeterministic choice between alternatives, PEGs instead use prioritized choice. PEGs address frequently felt expressiveness limitations of CFGs and REs, simplifying syntax definitions and making it unnecessary to separate their lexical and hierarchical components. A linear-time parser can be built for any PEG, avoiding both the complexity and fickleness of LR parsers and the inefficiency of generalized CFG parsing. While PEGs provide a rich set of operators for constructing grammars, they are reducible to two minimal recognition schemas developed around 1970, TS/TDPL and gTS/GTDPL, which are here proven equivalent in effective recognition power.

An excellent paper! I read it for the first time today and was surprised not to find it in the LtU archive.

Worlds: Controlling the Scope of Side Effects

Worlds: Controlling the Scope of Side Effects by Alessandro Warth and Alan Kay, 2008.

The state of an imperative program -— e.g., the values stored in global and local variables, objects’ instance variables, and arrays—changes as its statements are executed. These changes, or side effects, are visible globally: when one part of the program modifies an object, every other part that holds a reference to the same object (either directly or indirectly) is also affected. This paper introduces worlds, a language construct that reifies the notion of program state, and enables programmers to control the scope of side effects. We investigate this idea as an extension of JavaScript, and provide examples that illustrate some of the interesting idioms that it makes possible.

This introduces a new programming construct that's just the kind I love: they stimulate the imagination and provide simple and strong dynamic invariants to make programs easy to reason about.