On Iteration

On Iteration, by Andrei Alexandrescu.

Lisp pioneered forward iteration using singly-linked lists. Later object-oriented container designs often used the Iterator design pattern to offer sequential access using iterators. Though iterators are safe and sensible, their interface prevents definition of flexible, general, and efficient container-independent algorithms. For example, you can't reasonably expect to sort, organize as a binary heap, or even reverse a container by just using its Iterator. At about the same time, C++'s Standard Template Library (STL) defined its own conceptual hierarchy of iterators and showed that container-independent algorithms are possible using that hierarchy. However, STL iterators are marred by lack of safety, difficulty of usage, difficulty of definition, and a very close relationship to C++ that limits adoption by other languages. I propose an API that combines the advantages of Iterator and STL, and I bring evidence that the proposed abstraction is sensible by implementing a superset of STL's algorithms in the D language's standard library.
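To make the abstract's claim concrete, here is a minimal sketch (in Python, not D) of what a range-style interface might look like. The primitive names loosely follow D's `empty`, `front`, `popFront`, `back`, and `popBack`; the `SliceRange` class, the setter methods, and the two algorithms are hypothetical illustrations rather than the article's actual API. The point is only that reversal, which a plain Iterator cannot express, becomes a few lines once a range knows both of its ends.

```python
class SliceRange:
    """A hypothetical bidirectional range: a shrinking view over a Python list."""

    def __init__(self, items):
        self.items = items
        self.lo = 0                 # index of the current front
        self.hi = len(items)        # one past the current back

    def empty(self):
        return self.lo >= self.hi

    def front(self):
        return self.items[self.lo]

    def back(self):
        return self.items[self.hi - 1]

    def set_front(self, value):
        self.items[self.lo] = value

    def set_back(self, value):
        self.items[self.hi - 1] = value

    def pop_front(self):
        self.lo += 1

    def pop_back(self):
        self.hi -= 1


def reverse_in_place(r):
    """Reverse any bidirectional range using only the range primitives."""
    while not r.empty():
        f, b = r.front(), r.back()
        r.set_front(b)
        r.set_back(f)
        r.pop_front()
        if not r.empty():
            r.pop_back()


def find(r, value):
    """Shrink r from the front until it starts at value (or becomes empty)."""
    while not r.empty() and r.front() != value:
        r.pop_front()
    return r


data = [1, 2, 3, 4, 5]
reverse_in_place(SliceRange(data))
print(data)
```

Neither algorithm mentions the list directly, so any container that can hand out an object with these methods could reuse them. Note also that `find` returns a range (the remainder starting at the match) rather than a lone position.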

Previously: Iterators Must Go.

Project Sikuli

Picture- or screenshot-driven programming from MIT.

From the Sikuli project page:

Sikuli is a visual technology to search and automate graphical user interfaces (GUI) using images (screenshots). The first release of Sikuli contains Sikuli Script, a visual scripting API for Jython, and Sikuli IDE, an integrated development environment for writing visual scripts with screenshots easily.

Differentiating Parsers

A fascinating article by Oleg Kiselyov on delimited continuations:

We demonstrate the conversion of a regular parser to an incremental one in byte-code OCaml. The converted, incremental parser lets us parse from a stream that is only partially known. The parser may report what it can, asking for more input. When more input is supplied, the parsing resumes. The converted parser is not only incremental but also undoable and restartable. If, after ingesting a chunk of input, the parser reports a problem, we can "go back" and supply a different piece of input.

The conversion procedure is automatic and largely independent of the parser implementation. The parser should either be written without visible side effects, or else we should have access to its global mutable state. The parser otherwise may be written with no incremental parsing in mind, available to us only in a compiled form. The conversion procedure relies on the inversion of control, accomplished with the help of delimited continuations provided by the delimcc library.
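The inversion of control is easier to picture with a small example. The sketch below uses a Python generator as a stand-in for delimited continuations; it is not the article's OCaml code and does not use delimcc. Two differences matter. The parser here has to be written with `yield` in mind, whereas the article converts an unmodified parser. And Python generators are one-shot, so this version is incremental but not undoable or restartable. The names `sum_parser` and `feed_all` are made up for illustration.

```python
def sum_parser():
    """Parse whitespace-separated integers and sum them.

    Whenever the buffered input runs out, the parser yields its partial
    result and stays suspended until more input is sent in.
    Sending None signals end of input.
    """
    total = 0
    digits = ""
    while True:
        chunk = yield total          # report progress, ask for more input
        if chunk is None:            # end of input
            break
        for ch in chunk:
            if ch.isdigit():
                digits += ch
            elif ch.isspace():
                if digits:
                    total += int(digits)
                    digits = ""
            else:
                raise ValueError(f"unexpected character {ch!r}")
    if digits:                       # flush a trailing number
        total += int(digits)
    return total


def feed_all(chunks):
    """Push chunks into the parser; return (partial reports, final result)."""
    parser = sum_parser()
    next(parser)                     # run up to the first request for input
    reports = [parser.send(chunk) for chunk in chunks]
    try:
        parser.send(None)
    except StopIteration as done:
        return reports, done.value


print(feed_all(["12 3", "4 ", "5"]))
```

The number 34 arrives split across two chunks, and the parser resumes in the middle of the token without any special handling in the caller.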

Expressive Modes and Species of Language

R. E. Moe, 2006. Expressive Modes and Species of Language. In Proc. 23rd World Academy of Science, Engineering and Technology.

This paper is very relevant to my attempts to characterise what I call the Scott-Strachey school of computer science; cf. Right On!.

I am grateful to Ehud for bringing the following paper to my attention: White, G., 2004; The Philosophy of Computer Languages; In L. Floridi (ed.): Philosophy of Computing and Information; Blackwell, 2004.

Iterators Must Go

Andrei Alexandrescu: Iterators Must Go, BoostCon 2009 keynote.

Presents a simple yet far-reaching replacement for iterators, called ranges, and interesting D libraries built on it: std.algorithm and std.range.

Ranges pervade D: algorithms, lazy evaluation, random numbers, higher-order functions, foreach statement...

(Related: SERIES, enumerators, SRFI 1, and The Case For D by the same author)

Computer music: a bastion of interactive visual dataflow languages

The area of computer music has been designing systems for aiding music composition and performance for over 40 years now. The developers have not been much influenced by mainstream programming fads, but have been driven mainly by the needs of composers and performers. Current systems keep the principal original metaphor, the patchboard: a collection of live modules ("patches") connected by cables and easily inspected and modified on the fly. This metaphor has full GUI support and is extended with interactive visual tools for abstraction and exploration, such as the maquette. The language semantics are based on deterministic concurrency, with controlled use of nondeterminism and state.

Current systems are full-fledged programming platforms with visual dataflow languages. Two examples among many are the Max/MSP platform for controlling and generating music performances in real time and the OpenMusic design studio for composers. Max/MSP has two visual dataflow languages: one for control and one for real-time waveform generation. OpenMusic has a dataflow language controlled interactively with many tools for composers.

These systems are actually general-purpose programming platforms. They show that visual dataflow can be made both practical and scalable. In my view, this is one promising direction for the future of programming.

The Art of the Propagator

The Art of the Propagator, Alexey Radul and Gerald Jay Sussman.

We develop a programming model built on the idea that the basic computational elements are autonomous machines interconnected by shared cells through which they communicate. Each machine continuously examines the cells it is interested in, and adds information to some based on deductions it can make from information from the others. This model makes it easy to smoothly combine expression-oriented and constraint-based programming; it also easily accommodates implicit incremental distributed search in ordinary programs. This work builds on the original research of Guy Lewis Steele Jr. and was developed more recently with the help of Chris Hanson.

I just ran across this tech report. I haven't read it yet, but the subject is particularly well-timed for me, since I just finished a correctness proof for a simple FRP system implemented via imperative dataflow graphs, and so constraint propagation has been much on my mind recently. It's pretty clear that constraint propagation can do things that FRP doesn't, but it's not so clear to me whether this is a case of "more expressiveness" or "more fragile abstractions".
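For readers who want a feel for the model before opening the report, here is a deliberately tiny sketch of cells and propagators in Python. It is an assumption-laden toy, not the authors' system: each cell holds at most one number (the report's cells accumulate partial information), and the `Cell`, `propagator`, `run_network`, and `temperature_network` names are invented here. What it does show is the multidirectional flavour: the same network computes Fahrenheit from Celsius or Celsius from Fahrenheit, depending on which cell is told something first.

```python
class Cell:
    """Holds at most one value and wakes its neighbours when it gains one."""

    def __init__(self, name):
        self.name = name
        self.content = None
        self.neighbours = []

    def add_content(self, value, scheduler):
        if self.content is None:
            self.content = value
            scheduler.extend(self.neighbours)    # wake interested propagators
        elif abs(self.content - value) > 1e-9:
            raise ValueError(f"contradiction in cell {self.name}")


def propagator(inputs, output, fn, scheduler):
    """Attach a machine that writes fn(*inputs) to output once inputs are known."""
    def run():
        values = [cell.content for cell in inputs]
        if all(v is not None for v in values):
            output.add_content(fn(*values), scheduler)
    for cell in inputs:
        cell.neighbours.append(run)
    scheduler.append(run)


def run_network(scheduler):
    """Run queued propagators until nothing is left to do."""
    while scheduler:
        scheduler.pop()()


def temperature_network():
    scheduler = []
    f, c = Cell("fahrenheit"), Cell("celsius")
    propagator([f], c, lambda x: (x - 32) * 5 / 9, scheduler)
    propagator([c], f, lambda x: x * 9 / 5 + 32, scheduler)
    return f, c, scheduler


f, c, scheduler = temperature_network()
c.add_content(100, scheduler)
run_network(scheduler)
print(f.content)
```

A cell that is told two incompatible values raises an error here; a fuller treatment would merge partial information instead.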

Linear Logic and Permutation Stacks--The Forth Shall Be First

Linear Logic and Permutation Stacks--The Forth Shall Be First by Henry Baker, 1993.

Girard's linear logic can be used to model programming languages in which each bound variable name has exactly one "occurrence"--i.e., no variable can have implicit "fan-out"; multiple uses require explicit duplication. Among other nice properties, "linear" languages need no garbage collector, yet have no dangling reference problems. We show a natural equivalence between a "linear" programming language and a stack machine in which the top items can undergo arbitrary permutations. Such permutation stack machines can be considered combinator abstractions of Moore's Forth programming language.
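As a rough illustration of the "no implicit fan-out" idea, here is a toy permutation stack machine in Python. The word names (`swap`, `rot`, `dup`, `drop`) are the familiar Forth ones, but the `run` interpreter itself is a hypothetical sketch and not Baker's formal construction. Every value on the stack is consumed exactly once unless the program says `dup`, and a value can only be discarded with `drop`.

```python
def run(program, stack):
    """Execute a tiny permutation-stack program on a copy of stack (top is last)."""
    stack = list(stack)
    for op in program:
        if op == "swap":             # permute the top two items: a b -> b a
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "rot":            # permute the top three: a b c -> b c a
            stack.append(stack.pop(-3))
        elif op == "dup":            # explicit duplication, the only fan-out
            stack.append(stack[-1])
        elif op == "drop":           # explicit discard
            stack.pop()
        elif op == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown word {op!r}")
    return stack


# x*x + y with x = 3 and y = 4 on the stack: x is used twice, so it must be dup'ed.
print(run(["swap", "dup", "*", "+"], [3, 4]))
```

In a conventional language the expression `x*x + y` mentions `x` twice for free; here the second use shows up as a `dup` in the program text.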

I remembered this paper while chatting with a friend who's designing a stack-based instruction set and looking for relevant compilation techniques (imagine compiling C to Forth). Do you have some relevant references?

Today I found this paragraph particularly intriguing:

Since Forth is usually implemented on a traditional von Neumann machine, one thinks of the return stack as holding "return addresses". However, in these days of large instruction caches, in which entire cache lines are read from the main memory in one transaction, this view should be updated. It is well-known that non-scientific programs have a very high rate of conditional branches, with the mean number of instructions between branches being on the order of 10 or less. Forth programs are also very short, with "straight-line" (non-branching) sequences averaging 10 items or less. In these environments, it makes more sense to view the return stack itself as the instruction buffer cache! In other words, the return stack doesn't hold "return addresses" at all, but the instructions themselves! When a routine is entered, the entire routine is dumped onto the top of the return stack, and execution proceeds with the top item of this stack. Since routines are generally very short, the transfer of an entire routine is about the same amount of work as transferring a complete cache line in present architectures. Furthermore, an instruction stack-cache-buffer is normally accessed sequentially, and therefore can be implemented using shift register technology. Since a shift register can be shifted faster than a RAM can be accessed, the "access time" of this instruction stack-cache-buffer will not be a limiting factor in a machine's speed. Executing a loop in an instruction stack-cache-buffer is essentially the making of connections necessary to create a cyclic shift register which literally cycles the instructions of the loop around the cyclic shift register.

Imagine that!
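The quoted idea can be mimicked in software with a few lines of Python. This is only a toy interpreter under my own assumptions (the `execute` function, the `routines` table, and the choice of primitives are all invented for illustration), and it says nothing about the hardware argument. It does show the control-flow trick: a call saves no return address and simply dumps the routine's body onto the instruction stack.

```python
def execute(main, routines, data=None):
    """Run the word list `main`.

    Calling a routine pushes its whole body onto the instruction stack
    instead of saving a return address.
    """
    data = [] if data is None else list(data)
    instructions = list(reversed(main))      # top of stack = next instruction
    while instructions:
        word = instructions.pop()
        if isinstance(word, int):            # literal: push onto the data stack
            data.append(word)
        elif word in routines:               # call: dump the routine's body
            instructions.extend(reversed(routines[word]))
        elif word == "dup":
            data.append(data[-1])
        elif word == "*":
            b, a = data.pop(), data.pop()
            data.append(a * b)
        else:
            raise ValueError(f"unknown word {word!r}")
    return data


routines = {
    "square": ["dup", "*"],
    "fourth": ["square", "square"],
}
print(execute([3, "fourth"], routines))
```

Returning from a routine needs no instruction at all: when the body has been consumed, whatever was underneath it on the instruction stack is next.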

Worlds: Controlling the Scope of Side Effects

Worlds: Controlling the Scope of Side Effects by Alessandro Warth and Alan Kay, 2008.

The state of an imperative program (e.g., the values stored in global and local variables, objects’ instance variables, and arrays) changes as its statements are executed. These changes, or side effects, are visible globally: when one part of the program modifies an object, every other part that holds a reference to the same object (either directly or indirectly) is also affected. This paper introduces worlds, a language construct that reifies the notion of program state, and enables programmers to control the scope of side effects. We investigate this idea as an extension of JavaScript, and provide examples that illustrate some of the interesting idioms that it makes possible.

This paper introduces a new programming construct of just the kind I love: one that stimulates the imagination and provides simple, strong dynamic invariants that make programs easy to reason about.
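A rough way to get a feel for the idea is to model a world as a layer of pending writes over its parent. The Python sketch below is an illustration under that assumption, not the paper's JavaScript extension: the `World` class and its method names are chosen here for the example, state is reduced to a key-value store, and the sketch does no consistency checking when a child commits.

```python
class World:
    """A hypothetical scope for side effects: writes stay local until committed."""

    def __init__(self, parent=None):
        self.parent = parent
        self.writes = {}

    def sprout(self):
        """Create a child world that sees this world's state."""
        return World(parent=self)

    def get(self, key):
        """Look the key up here first, then in enclosing worlds."""
        world = self
        while world is not None:
            if key in world.writes:
                return world.writes[key]
            world = world.parent
        raise KeyError(key)

    def set(self, key, value):
        """Record a side effect that is visible only in this world and its children."""
        self.writes[key] = value

    def commit(self):
        """Propagate this world's side effects to its parent."""
        if self.parent is None:
            raise RuntimeError("the top-level world has no parent")
        self.parent.writes.update(self.writes)
        self.writes.clear()


top = World()
top.set("x", 1)

trial = top.sprout()
trial.set("x", 2)
print(top.get("x"), trial.get("x"))   # the parent is unaffected so far

trial.commit()
print(top.get("x"))
```

Undo falls out for free in this model: a child world that is never committed can simply be dropped, and its parent never sees the changes.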

Parsing Expression Grammars

Parsing Expression Grammars: A Recognition-Based Syntactic Foundation by Bryan Ford, MIT, 2004.

For decades we have been using Chomsky's generative system of grammars, particularly context-free grammars (CFGs) and regular expressions (REs), to express the syntax of programming languages and protocols. The power of generative grammars to express ambiguity is crucial to their original purpose of modelling natural languages, but this very power makes it unnecessarily difficult both to express and to parse machine-oriented languages using CFGs. Parsing Expression Grammars (PEGs) provide an alternative, recognition-based formal foundation for describing machine-oriented syntax, which solves the ambiguity problem by not introducing ambiguity in the first place. Where CFGs express nondeterministic choice between alternatives, PEGs instead use prioritized choice. PEGs address frequently felt expressiveness limitations of CFGs and REs, simplifying syntax definitions and making it unnecessary to separate their lexical and hierarchical components. A linear-time parser can be built for any PEG, avoiding both the complexity and fickleness of LR parsers and the inefficiency of generalized CFG parsing. While PEGs provide a rich set of operators for constructing grammars, they are reducible to two minimal recognition schemas developed around 1970, TS/TDPL and gTS/GTDPL, which are here proven equivalent in effective recognition power.

An excellent paper! I read it for the first time today and was surprised not to find it in the LtU archive.
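The difference between prioritized choice and CFG-style alternation shows up even in a toy recognizer. The combinators below are a minimal Python sketch written for this post (the names `lit`, `seq`, `choice`, and `matches` are mine, not Ford's), and they do no memoization, so this is not the linear-time parser the abstract mentions. Each parser takes a string and a position and returns the new position, or `None` on failure.

```python
def lit(text):
    """Match a literal string at the current position."""
    def parse(s, i):
        return i + len(text) if s.startswith(text, i) else None
    return parse


def seq(*parsers):
    """Match each parser in order; fail if any of them fails."""
    def parse(s, i):
        for p in parsers:
            i = p(s, i)
            if i is None:
                return None
        return i
    return parse


def choice(*parsers):
    """Prioritized choice: commit to the first alternative that succeeds."""
    def parse(s, i):
        for p in parsers:
            j = p(s, i)
            if j is not None:
                return j
        return None
    return parse


def matches(parser, s):
    """True if the parser consumes the whole string."""
    return parser(s, 0) == len(s)


short_first = seq(choice(lit("a"), lit("ab")), lit("c"))   # ("a" / "ab") "c"
long_first = seq(choice(lit("ab"), lit("a")), lit("c"))    # ("ab" / "a") "c"

print(matches(short_first, "abc"), matches(long_first, "abc"))
```

Read as a CFG, `(a | ab) c` would accept "abc". As a PEG, `("a" / "ab") "c"` rejects it, because the choice commits to "a" and the sequence then fails on "b" without revisiting the second alternative. Order of alternatives is part of the grammar's meaning, which is exactly how ambiguity is ruled out.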
