archives

Derivatives of Regular Expressions

Derivatives of Regular Expressions, Janusz Brzozowski, Journal of the ACM 1964.

Kleene's regular expressions, which can be used for describing sequential circuits, were defined using three operators (union, concatenation and iterate) on sets of sequences. Word descriptions of problems can be more easily put in the regular expression language if the language is enriched by the inclusion of other logical operations. However, in the problem of converting the regular expression description to a state diagram, the existing methods either cannot handle expressions with additional operators, or are made quite complicated by the presence of such operators. In this paper the notion of a derivative of a regular expression is introduced and the properties of derivatives are discussed. This leads, in a very natural way, to the construction of a state diagram from a regular expression containing any number of logical operators.

This is one of my favorite papers. It describes a very cute algorithm for building deterministic finite automata directly from a regular expression. The key trick is the idea of a derivative of a regular expression with respect to a string, which is a non-obvious but fun idea.

Note: This is an ACM DL link; I couldn't find the paper freely available online. :(

C++ Historical Sources Archive

Seeing as we just had a lively discussion of Stroustrup’s HOPL paper, it's more than appropriate to mention another great resource courtesy of Paul McJones: The C++ Historical Sources Archive.

Among the treasures are the source code of the Cfront releases, but there's much more there so go take a peek.