Lowering: A Static Optimization Technique for Transparent Functional Reactivity

Lowering: A Static Optimization Technique for Transparent Functional Reactivity, Kimberley Burchett, Gregory H. Cooper, and Shriram Krishnamurthi. PEPM 2007.

Functional Reactive Programming (FRP) extends traditional functional programming with dataflow evaluation, making it possible to write interactive programs in a declarative style. An FRP language creates a dynamic graph of data dependencies and reacts to changes by propagating updates through the graph. In a transparent FRP language, the primitive operators are implicitly lifted, so they construct graph nodes when they are applied to time-varying values. This model has some attractive properties, but it tends to produce a large graph that is costly to maintain. In this paper, we develop a transformation we call lowering, which improves performance by reducing the size of the graph. We present a static analysis that guides the sound application of this optimization, and we present benchmark results that demonstrate dramatic improvements in both speed and memory usage for real programs.

Whenever I read about compiler optimizations, I try (with mixed success) to relate them to transformations in the lambda calculus. I haven't managed to figure out what's going on with the dip construct the authors propose, but I would guess that the place to look is the proof theory of the necessity operator in modal logic -- dataflow programming can be seen as a kind of stream programming, and streams form a comonad over the lambda calculus, and comonads give semantics to the modal necessity (box) operator.

Our very own Kimberley

Our very own Kimberley Burchett!!

dip

It seems to me (I could be wrong) that (dip (x y z) e) is equivalent to ((lift (lambda (x y z) e)) x y z). The point being that, if one wishes to create a signal node performing some "complex" operation on some signals (say, x+y*z), one could either create a "flat" (non-reactive) function performing that operation, lift it, and apply it to its dependent signals, or use the dip construct to create the signal node directly. Of course, the intent is not for dip to be used directly, but for it to be used by the optimizer to meld together the multiple signal nodes which would be generated from a reactive expression such as x+y*z (which would normally create two signal nodes, one for * and one for +).
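
To make that concrete, here is a toy sketch in plain Scheme (my own model, not FrTime's actual implementation: a "signal" here is just a function of time, and lift maps a flat function over signals), showing the per-primitive version next to the single dipped node it would be lowered to:

  ;; Toy model of lifting: a signal is a function of time (not a real
  ;; FrTime graph node), and lift maps a flat function over signals.
  (define (lift f)
    (lambda sigs
      (lambda (t) (apply f (map (lambda (s) (s t)) sigs)))))

  (define seconds (lambda (t) t))   ; stand-in for the time signal

  ;; Unoptimized x+y*z: one lifted node for *, another for +.
  (define (combine x y z)
    ((lift +) x ((lift *) y z)))

  ;; Lowered form: a single node around flat code, i.e. what
  ;; (dip (x y z) (+ x (* y z))) expands to.
  (define (combine/dipped x y z)
    ((lift (lambda (x y z) (+ x (* y z)))) x y z))

  ;; Both compute the same signal:
  ;; ((combine seconds seconds seconds) 3)        => 12
  ;; ((combine/dipped seconds seconds seconds) 3) => 12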

Exactly

Yes, exactly. I originally invented "dip" in order to make it easier to define the optimization compositionally. Once I had the inference rules worked out (using dip), it also turned out to be a pretty straightforward way to actually write the implementation.

By the way, the idea behind dip applies to more than just FRP languages. In general, it applies to any monad where lift (f . g) == (lift f) . (lift g) (as noted in the final section of the paper). For heavyweight monads, like in FRP, it can be a big win.

The Stars in Their Courses

Very nice paper.

I have a quibble with the example (Figs 1 & 2). They compute the position of a planet in a circular orbit as

(R cos theta, R sin theta)

and for theta they use (/ speed seconds) where seconds is the time signal. But even if "speed" is angular velocity (where it would be more commonly named "omega"), the formula should be that times time, not divided by. One imagines the planets slowly backpedalling down the ecliptic towards the vernal equinox... (as time in the denominator goes to infinity). And really, if the parameter is speed, angular velocity would be speed/R (i.e. (/ speed dist) in the program's terms).
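
For concreteness, what I'd expect instead is something like this (sticking with the program's names speed, dist, and seconds, and leaning on FrTime's implicit lifting -- a sketch, not the actual code from the figure):

  ;; omega = speed / dist, and theta = omega * time (not speed / time)
  (define theta (* (/ speed dist) seconds))
  (define pos   (list (* dist (cos theta))
                      (* dist (sin theta))))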

(end quibble)

The naive implementation of reactive programming would simply be to run your entire conventional program (functional or not, as long as it terminates) to completion between each timestep. FrTime (and Yampa) appear, AFAICT, to optimize this by memoizing every value point in the program and short-circuiting propagation if the value doesn't change between cycles. "Dipping" appears to be a way of going back to the naive form in patches. Given their implementation, an excellent idea, though if I were writing a reactive language from scratch (oddly enough, I am), I would start from the naive form and only put the memoization in where it would help in the first place, rather than putting it in everywhere and then selectively taking it out.

Josh

There are several advantages

There are several advantages to the way we did it (i.e. removing lifts wherever possible rather than trying to insert lifts only where appropriate). We did consider doing it the other way. We referred to the approach described in the paper as "optimization", and the approach you suggest as "compilation".

From my perspective, the biggest advantage of the optimization approach is that if an optimizer comes across an unrecognized language construct, it can just throw up its hands and leave the original code unmodified (application of higher-order functions is a prime example of this). In contrast, by relying on a compiler to add dataflow semantics only where necessary, you force the compiler to support every possible language feature.

The optimization approach also allows the end programmer to completely disable optimization. This might be desirable if, say, the optimizer produces incorrect code. In contrast, disabling a compiler would guarantee incorrect code.

Finally, by making the optimization into a separate, optional pass, you end up simplifying the language implementation. It's entirely possible to reason about the implementation of FrTime without thinking about dipping at all (I know this is true, because FrTime was fully implemented before dipping was even thought of). In contrast, if you relied on a source-to-source transform to insert lifts where appropriate, you'd need to take that into consideration whenever extending the FrTime language.

Your point about the planet example is well taken, though :) I wasn't thinking especially clearly that night.

compilation

Certainly. What I'm trying to do is implement a small, self-contained language that I can compile down to machine code for robot controllers and the like. So I'm expecting the compiler to have to know about all the language features...

But I think I'm still missing something. In many cases, controller programs are polled rather than interrupt-driven; every piece of code is simply run at pre-determined intervals (ranging from microseconds to seconds). Check out the Hardware Abstraction Layer for a reactive system that works this way (running in the kernel space in real-time Linux).

The point is, if you do things this way, your program can do any one-shot computation you like, including higher-order functions etc, and you just consider the whole thing one "dipped" block.

So my question is, what is there about the semantics (as opposed to the implementation) of FrTime that would keep this from working there?

Josh

Stateful nodes

FrTime supports stateful nodes in the dataflow graph. For example, the delay operator can delay a sequence of events by an arbitrary amount of time. Similarly, the derivative and integral operators won't work unless they're able to remember the previous values of their inputs.

If you were to try to dip these operators, you would get incorrect results -- dipping only works for pure functions, not stateful ones. In other words, the correctness of FrTime programs depends on the fact that the dataflow graph is persistent.
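
Here's a tiny sketch of the difference (hypothetical names, in a deliberately simplified push-based model rather than FrTime's real internals): a lifted node is nothing but its function, whereas a stateful node closes over state that must survive from one update to the next -- which is exactly what melding it into a bigger flat function would throw away.

  ;; Simplified push model: a node is a procedure called once per update.
  (define (lift-node f)               ; pure: safe to fuse with neighbors
    (lambda args (apply f args)))

  (define (delay-one-node)            ; stateful: remembers its previous input
    (let ((prev #f))
      (lambda (x)
        (let ((old prev))
          (set! prev x)
          old))))

  ;; lift (f . g) == (lift f) . (lift g) holds for lift-node, so such nodes
  ;; can be melded into one dip.  It fails for delay-one-node, whose prev
  ;; must persist in the graph across updates, so the optimizer leaves it alone.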

Stateful nodes

Derivative, integral, and so forth are of course crucial to most signal processing and control applications ... but I conjecture that a single stateful operation suffices: a function that maps a signal onto its history as a list or vector or whatever linear data structure your static language provides (or, if the timesteps are not regular, a double list or alist or the like). Sum it for integral, sample it at a fixed offset for delay, etc.

So: define the other stateful primitives in terms of this one, and wherever it appears in the program, stick in a sequence accumulator.
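
Something like this, say (a rough sketch with made-up names, assuming one value gets pushed into the node per timestep):

  ;; The single stateful primitive: accumulate a signal's history,
  ;; most recent value first.
  (define (history-node)
    (let ((hist '()))
      (lambda (x)
        (set! hist (cons x hist))
        hist)))

  ;; The other stateful operators become flat functions of the history:
  (define (delayed hist n)            ; the value n steps ago (#f if too early)
    (if (< (length hist) (+ n 1)) #f (list-ref hist n)))

  (define (summed hist dt)            ; crude integral: sum of samples * dt
    (* dt (apply + hist)))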

Is there any stateful function (other than negative delay :-) this wouldn't work for?

Josh

Even simpler

The system need only keep track of the most recent state of any given node -- the node itself can build a history list if that is what is desired.

And yes, this works for all (even noncausal!) state functions. In control theory, any* system can be represented as a first-order differential matrix equation (this is the so-called "state-space" representation): the change in state of some system (i.e. its next state) is purely a function of the input and the current state of the system. The order of the state matrix corresponds directly to the "amount" of state the system has (i.e. how much it can remember).

* "any system" meaning any system which is both linear (due to the use of diff eqs) and lumped (meaning the state can be broken up into discrete chunks and represented as a vector), but these qualifiers don't affect the answer to your question, since computers aren't restricted to diff eqs and their state is already discretized.

Recursion considered Harmful

Turns out that the "naive" implementation is a little more involved than it first appeared (but what's new?).

In particular, when you "stick in the sequence accumulator," you need a table of accumulators indexed by something approximating the call stack at the time/place of invocation. This assumes of course a purely functional program -- I don't even know what it means if it's imperative.
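
Roughly like this (a sketch -- the keying scheme and the names are made up): each dynamic occurrence of a stateful primitive gets its own accumulator, looked up by the chain of call sites that led to it.

  ;; Accumulators keyed by an approximation of the call path.
  (define accumulators '())           ; alist: call-path -> accumulator node

  (define (accumulator-at call-path make-node)
    (let ((entry (assoc call-path accumulators)))
      (if entry
          (cdr entry)
          (let ((node (make-node)))
            (set! accumulators (cons (cons call-path node) accumulators))
            node))))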

The tables are a smaller set than the full DFG; it remains to be seen how they stack up against the optimized DFG.

Josh