Implementation

Lowering: A Static Optimization Technique for Transparent Functional Reactivity

Lowering: A Static Optimization Technique for Transparent Functional Reactivity, Kimberley Burchett, Gregory H. Cooper, and Shriram Krishnamurthi. PEPM 2007.

Functional Reactive Programming (FRP) extends traditional functional programming with dataflow evaluation, making it possible to write interactive programs in a declarative style. An FRP language creates a dynamic graph of data dependencies and reacts to changes by propagating updates through the graph. In a transparent FRP language, the primitive operators are implicitly lifted, so they construct graph nodes when they are applied to time-varying values. This model has some attractive properties, but it tends to produce a large graph that is costly to maintain. In this paper, we develop a transformation we call lowering, which improves performance by reducing the size of the graph. We present a static analysis that guides the sound application of this optimization, and we present benchmark results that demonstrate dramatic improvements in both speed and memory usage for real programs.

Whenever I read about compiler optimizations, I try (with mixed success) to relate them to transformations in the lambda calculus. I haven't managed to figure out what's going on with the dip construct the authors propose, but I would guess that the place to look is the proof theory of the necessity operator in modal logic -- dataflow programming can be seen as a kind of stream programming, and streams form a comonad over the lambda calculus, and comonads give semantics to the modal necessity (box) operator.
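To make the abstract's point about graph size concrete, here is a minimal sketch in Python. It is not the paper's analysis or its dip construct: `Signal`, `lift`, `build_lifted` and `build_lowered` are hypothetical names, and the propagation is deliberately naive (no topological ordering). It only shows that implicitly lifting each primitive creates one node per application, while a lowered expression computes the same value with a single node.

```python
class Signal:
    """A node in the dataflow graph holding a time-varying value."""
    count = 0  # total nodes created, used to compare graph sizes

    def __init__(self, compute, inputs=()):
        Signal.count += 1
        self.compute = compute
        self.inputs = list(inputs)
        self.dependents = []
        for s in self.inputs:
            s.dependents.append(self)
        self.value = compute(*[s.value for s in self.inputs])

    def set(self, value):
        """Change a source signal and push the update through the graph."""
        self.compute = lambda: value
        self.value = value
        self._propagate()

    def _propagate(self):
        # Naive depth-first update; a real FRP system would order updates
        # topologically to avoid glitches and redundant recomputation.
        for d in self.dependents:
            d.value = d.compute(*[s.value for s in d.inputs])
            d._propagate()


def source(v):
    return Signal(lambda: v)


def lift(f):
    """Turn a function on plain values into one that builds a graph node."""
    def lifted(*signals):
        return Signal(f, signals)
    return lifted


add = lift(lambda a, b: a + b)
mul = lift(lambda a, b: a * b)


def build_lifted(a, b, c):
    # Every primitive is lifted separately: two nodes for (a + b) * c.
    return mul(add(a, b), c)


def build_lowered(a, b, c):
    # The whole expression runs on plain values inside one node.
    return lift(lambda x, y, z: (x + y) * z)(a, b, c)
```

With three sources, `build_lifted` adds two nodes to the graph and `build_lowered` adds one; both track changes made with `set` on any source.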

Threads in JavaScript?

Threads in JavaScript? "Over your dead body," says Brendan.

But Neil Mix begs to differ -- they're already there!

Neil's latest blog post presents a cool hack combining JavaScript 1.7's generators with trampolined style to implement very lightweight cooperative threads.

The implementation weighs in at a breathtakingly small 4k.
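The same idea can be sketched with Python generators, which behave much like the JavaScript 1.7 ones. This is not Neil's code: `Thread`, `run`, `fetch` and `task` are hypothetical names. Each thread keeps its own stack of generators; yielding a generator acts as a call, finishing a generator acts as a return, and a plain `yield` hands control back to the scheduler.

```python
import types
from collections import deque


class Thread:
    """One cooperative thread: a stack of generators driven by a trampoline."""

    def __init__(self, gen):
        self.stack = [gen]
        self.pending = None  # value to send into the top generator

    def step(self):
        """Run until the next plain yield. Return False once the thread ends."""
        while self.stack:
            top = self.stack[-1]
            try:
                result = top.send(self.pending)
            except StopIteration as stop:
                self.stack.pop()           # callee returned
                self.pending = stop.value  # pass its result to the caller
                continue
            self.pending = None
            if isinstance(result, types.GeneratorType):
                self.stack.append(result)  # nested call: push the callee
                continue
            return True                    # plain yield: let others run
        return False


def run(*gens):
    """Round-robin scheduler over the given generators."""
    ready = deque(Thread(g) for g in gens)
    while ready:
        t = ready.popleft()
        if t.step():
            ready.append(t)


def fetch(name, log):
    log.append(f"{name}: start fetch")
    yield                      # cooperative yield point inside a callee
    return name.upper()


def task(name, log):
    value = yield fetch(name, log)  # "call" another generator
    log.append(f"{name}: got {value}")
    yield
    log.append(f"{name}: done")
```

Running `run(task("a", log), task("b", log))` interleaves the two tasks at every plain `yield`, including the one inside `fetch`.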

Regular Expression Matching Can Be Simple And Fast

With Peter's observation that everything good in Computer Science happened during the "Golden Age" freshly in mind, I found Russ Cox's recent article on regular expressions to be enjoyable reading.

This is a tale of two approaches to regular expression matching. One of them is in widespread use in the standard interpreters for many languages, including Perl. The other is used only in a few places, notably most implementations of awk and grep. The two approaches have wildly different performance characteristics ... The trends shown in the graph continue: the Thompson NFA handles a 100-character string in under 200 microseconds, while Perl would require over 10^15 years.

Combining implementation details, finite automata, and a foray into decades-old theory, this article shows how most of our favorite little languages have enormous performance bottlenecks for certain categories of string comparisons.

An additional data point: The Shootout benchmarks have a large string comparison test. It's interesting that Tcl is at the top of the heap for performance. Guess which one is using the Thompson NFA algorithm for regular expressions?
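For a feel of why the automaton approach stays fast, here is a small Python sketch of the state-set simulation. It is a simplification and not the full Thompson construction: it handles only single characters with an optional `?`, `*` or `+` suffix (no grouping or alternation), it matches the whole string, and `parse`, `closure` and `match` are hypothetical names. The point is that it tracks a set of positions in the pattern rather than backtracking, so the work is bounded by pattern length times text length.

```python
def parse(pattern):
    """Split a pattern into (char, quantifier) items.

    The quantifier is '', '?', '*' or '+'.
    """
    items = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        quant = ''
        if i + 1 < len(pattern) and pattern[i + 1] in ('?', '*', '+'):
            quant = pattern[i + 1]
            i += 1
        items.append((ch, quant))
        i += 1
    return items


def closure(items, states):
    """Add every state reachable without consuming input.

    State s means "about to match item s"; optional items can be skipped.
    """
    result = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        if s < len(items) and items[s][1] in ('?', '*') and s + 1 not in result:
            result.add(s + 1)
            stack.append(s + 1)
    return result


def match(pattern, text):
    """Return True if the whole text matches the pattern."""
    items = parse(pattern)
    current = closure(items, {0})
    for c in text:
        nxt = set()
        for s in current:
            if s < len(items) and items[s][0] == c:
                nxt.add(s + 1)                   # move past this item
                if items[s][1] in ('*', '+'):
                    nxt.add(s)                   # or stay to match it again
        current = closure(items, nxt)
        if not current:
            return False
    return len(items) in current
```

The pathological case from the article, `a?` repeated n times followed by `a` repeated n times matched against n copies of `a`, goes through this simulation without any exponential blowup, for example with `match('a?' * 30 + 'a' * 30, 'a' * 30)`.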

Lightweight Fusion by Fixed Point Promotion

Lightweight Fusion by Fixed Point Promotion, Atsushi Ohori and Isao Sasano.

This paper proposes a lightweight fusion method for general recursive function definitions. Compared with existing proposals, our method has several significant practical features: it works for general recursive functions on general algebraic data types; it does not produce extra runtime overhead (except for possible code size increase due to the success of fusion); and it is readily incorporated in standard inlining optimization. This is achieved by extending the ordinary inlining process with a new fusion law that transforms a term of the form f o (fix g.λx.E) to a new fixed point term fix h.λx.E' by promoting the function f through the fixed point operator.

This is a sound syntactic transformation rule that is not sensitive to the types of f and g. This property makes our method applicable to a wide range of functions including those with multi-parameters in both curried and uncurried forms. Although this method does not guarantee any form of completeness, it fuses typical examples discussed in the literature and others that involve accumulating parameters, either in the foldl-like specific forms or in general recursive forms, without any additional machinery.

In order to substantiate our claim, we have implemented our method in a compiler. Although it is preliminary, it demonstrates practical feasibility of this method.

Deforestation is one of those optimizations every functional programmer who has ever had to rewrite a beautiful composition of maps and filters into an evil, ugly explicit fold has always longed for. Unfortunately, the standard lightweight fusion algorithms have trouble with examples as simple as foldl, and this paper has a very nice account of a simple algorithm that can handle it.
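As a reminder of what is at stake, here is the before and after written out by hand in Python. This is not the paper's fixed point promotion algorithm, only the kind of rewrite it aims to automate; `map_list`, `foldl`, `sum_squares_composed` and `sum_squares_fused` are hypothetical names. The composed version builds an intermediate list that the foldl-style consumer immediately tears down; the fused version threads the accumulator through a single loop.

```python
def map_list(f, xs):
    """Producer: allocates a new list holding f(x) for each x."""
    return [f(x) for x in xs]


def foldl(step, acc, xs):
    """Consumer: left fold with an accumulating parameter."""
    for x in xs:
        acc = step(acc, x)
    return acc


def sum_squares_composed(xs):
    # Readable composition, but map_list builds a throwaway list.
    return foldl(lambda acc, y: acc + y, 0, map_list(lambda x: x * x, xs))


def sum_squares_fused(xs):
    # The same computation after fusing the map into the fold by hand:
    # one pass, no intermediate list.
    acc = 0
    for x in xs:
        acc = acc + x * x
    return acc
```

Both functions return the same result for any list of numbers; only the allocation behavior differs.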

The Missing Link - Dynamic Components for ML

The Missing Link - Dynamic Components for ML, Andreas Rossberg. ICFP 2006.

Despite its powerful module system, ML has not yet evolved for the modern world of dynamic and open modular programming, to which more primitive languages have adapted better so far. We present the design and semantics of a simple yet expressive first-class component system for ML. It provides dynamic linking in a type-safe and type-flexible manner, and allows selective execution in sandboxes. The system is defined solely by reduction to higher-order modules plus an extension with simple module-level dynamics, which we call packages. To represent components outside processes we employ generic pickling. We give a module calculus formalising the semantics of packages and pickling.

This is a very nice paper showing how to integrate dynamic loading into the ML module system. Er, I guess I'm repeating the abstract. I thought this paper, in addition to the feature it gave, was also a good demonstration of how to put the Dreyer-Crary-Harper account of ML modules to work.

ACM Queue: Realtime Garbage Collection

The Metronome technology, developed at IBM Research and now available in production, solves [the problem of realtime GC] by limiting interruptions to one millisecond and spacing them evenly throughout the application's execution. Memory consumption is also strictly limited and predictable, since an application can't be realtime if it starts paging or throws an out-of-memory exception.

The article is primarily about the Metronome garbage collection technology, but includes a nice introduction to garbage collection and realtime terminology as well.

State of the Union: Type Inference via Craig Interpolation

State of the Union: Type Inference via Craig Interpolation, by Ranjit Jhala, Rupak Majumdar, and Ru-Gang Xu

The ad-hoc use of unions to encode disjoint sum types in C programs and the inability of C's type system to check the safe use of these unions is a long standing source of subtle bugs. We present a dependent type system that rigorously captures the ad-hoc protocols that programmers use to encode disjoint sums, and introduce a novel technique for automatically inferring, via Craig Interpolation, those dependent types and thus those protocols. In addition to checking the safe use of unions, the dependent type information inferred by interpolation gives programmers looking to modify or extend legacy code a precise understanding of the conditions under which some fields may be safely accessed. We present an empirical evaluation of our technique on 350KLOC of open source C code. In 80 out of 90 predicated edges (corresponding to 1472 out of 1684 union accesses), our type system is able to infer the correct dependent types. This demonstrates that our type system captures and explicates programmers' informal reasoning about unions, without requiring manual annotation or rewriting.

Programming the Greedy CAM Machine

Programming the Greedy CAM Machine. Erik Ruf. January 2007

The Greedy CAM architecture describes a class of experimental processors that aim to cope with memory latency and enable parallelism by combining a streams-and-kernels execution model with a relational-query-based memory model. This article focuses on the programming abstraction (equivalent to the ISA in more conventional systems) of Greedy CAM systems, as exemplified by a low-level intermediate language. Using a series of small example programs, we demonstrate several programming idioms and analyze their performance using a simple functional-level simulator. We also suggest extensions needed for the implementation of higher-level programming abstractions.

Section 6.8 is on the suitability of LINT, the low-level intermediate language described in the paper, as a target language for the compilation of higher-level abstractions. But comments on the general issues discussed in the paper are welcome as well...

Back to the Future

Back to the future: the story of Squeak, a practical Smalltalk written in itself by Dan Ingalls, Ted Kaehler, John Maloney, Scott Wallace, Alan Kay, 1997.

Squeak is an open, highly-portable Smalltalk implementation whose virtual machine is written entirely in Smalltalk, making it easy to debug, analyze, and change. To achieve practical performance, a translator produces an equivalent C program whose performance is comparable to commercial Smalltalks.

Other noteworthy aspects of Squeak include: a compact object format that typically requires only a single word of overhead per object; a simple yet efficient incremental garbage collector for 32-bit direct pointers; efficient bulk-mutation of objects; extensions of BitBlt to handle color of any depth and anti-aliased image rotation and scaling; and real-time sound and music synthesis written entirely in Smalltalk.

This paper is so good that it's hard to believe it was written after 1990!

A Garbage-Collecting Typed Assembly Language

A Garbage-Collecting Typed Assembly Language. Chris Hawblitzel; Heng Huang; Lea Wittie; Juan Chen.

Typed assembly languages usually support heap allocation safely, but often rely on an external garbage collector to deallocate objects from the heap and prevent unsafe dangling pointers. Even if the external garbage collector is provably correct, verifying the safety of the interaction between TAL programs and garbage collection is nontrivial. This paper introduces a typed assembly language whose type system is expressive enough to type-check a Cheney-queue copying garbage collector, so that ordinary programs and garbage collection can co-exist and interact inside a single typed language. The only built-in types for memory are linear types describing individual memory words, so that TAL programmers can define their own object layouts, method table layouts, heap layouts, and memory management techniques.

The TAL-GC proofs can be found here.
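For readers who have not seen a Cheney-queue collector, here is an untyped Python sketch of the algorithm the abstract mentions. It says nothing about the paper's type system, and the representation is an assumption made for illustration: a heap space is a list of objects, an object is a list of fields, and `Ref` and `collect` are hypothetical names. A real collector would store forwarding pointers in the from-space objects themselves; a dictionary stands in for that here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    addr: int  # index of an object within a heap space


def collect(from_space, roots):
    """Copy everything reachable from roots into a fresh to-space.

    Returns (to_space, new_roots). Unreachable objects are simply not copied.
    """
    to_space = []
    forwarding = {}  # from-space address -> to-space address

    def forward(value):
        if not isinstance(value, Ref):
            return value  # immediate values are kept as they are
        if value.addr not in forwarding:
            forwarding[value.addr] = len(to_space)
            # Shallow copy: the fields still point into from-space for now.
            to_space.append(list(from_space[value.addr]))
        return Ref(forwarding[value.addr])

    new_roots = [forward(r) for r in roots]

    # The to-space doubles as the work queue: objects before `scan` have
    # been fixed up, objects from `scan` onward still need their fields
    # forwarded.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        for i, field in enumerate(obj):
            obj[i] = forward(field)
        scan += 1
    return to_space, new_roots
```

Given a heap with a two-object cycle reachable from the root and two unreachable objects, the collector returns a to-space containing only the cycle, with its pointers rewritten to the new addresses.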
