Computing by deltas?

Say we have a program: F -> G. Under what situations can we generate a program which operates on *deltas* of F and G?

That is, say that dF is an efficiently stored delta value between an old and new value of F, and dG is a delta between values of G. Are there many situations where we can derive the program dF -> dG given F -> G ?

It seems like this is probably a topic with existing research, but I'm having trouble figuring out what to search for. This would be interesting in the context of a framework like React.js, where there is user code that modifies the user-interface state based on external events, then more user code which translates that state into a virtual DOM value, then finally the real DOM is modified using the virtual DOM. If we were able to derive a delta-based version of the user's code, then we could directly translate external events into real DOM manipulation, without generating an entire virtual DOM along the way.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Work from the DB community

Here is recent related work from Daniel Lupei, Christoph Koch and Val Tannen:

Incremental View Maintenance for Nested-Relational Databases

They address your problem for the NRC and then extend their approach to the STLC.

incremental computing

Incremental computing would be the common name of this research area. Self-adjusting computation is also somehow related, but I'm not sure what subtle distinctions Umat Acar is aiming for.

There are also deep relationships to reactive or dataflow programming (for which incremental computing has direct performance motivations) and live programming (which benefits from an incremental compiler).

deps + trace

I think the intention is that "self-adjusting computation" implies the combined strategies of dependency tracking (with memoization) and dynamic tracing (with replay).

This is how I understand it

This is how I understand it currently.

Incremental lambda-calculus

Thirs project, Incremental λ-Calculus, is just starting (compared to more mature approaches like SAC), with a first publication last year.

A theory of changes for higher-order languages — incrementalizing λ-calculi by static differentiation
Paolo Giarusso, Yufei Cai, Tillmann Rendel, and Klaus Ostermann. Accepted at PLDI ’14.

If the result of an expensive computation is invalidated by a small change to the input, the old result should be updated incrementally instead of reexecuting the whole computation. We incrementalize programs through their derivative. A derivative maps changes in the program’s input directly to changes in the program’s output, without reexecuting the original program. We present a program transformation taking programs to their derivatives, which is fully static and automatic, supports first-class functions, and produces derivatives amenable to standard optimization.

We prove the program transformation correct in Agda for a family of simply-typed λ-calculi, parameterized by base types and primitives. A precise interface specifies what is required to incrementalize the chosen primitives.

We investigate performance by a case study: We implement in Scala the program transformation, a plugin and improve performance of a nontrivial program by orders of magnitude.

I like the nice dependent types: a key idea of this work is that the "diffs" possible from a value v do not live in some common type diff(T), but rather in a value-dependent type diff(v). Intuitively, the empty list and a non-empty list have fairly different types of possible changes. This makes change-merging and change-producing operations total, and allow to give them a nice operational theory. Good design, through types.

Automatically computing

Automatically computing deltas is impossible for general computations, often even for declarative computations (you have to simplify the model a lot before delta computation is viable).

Even for something as simple as parsing, I find the redo/repair approach to be superior. Why this requires a virtual DOM has more to do with problems in HTML (no background rendering thread like other modern retained UI toolkits) than intrinsic problems with the approach.

If you want to see that approach taken to its extreme, see Glitch/managed time programming.

Reference on impossibility?

I've long thought that a general *efficient* solution is impossible, but I've never seen a reference to that effect (my best example are cryptographic functions). Do you have a reference?

I don't believe an efficient

I don't believe an efficient solution is possible without some form of checkpointing/logging. But there is no evidence for this; just that fighting against increased entropy and the arrow of time is often a losing battle.

Always specific

There is no general way to take a composition of `A → B` and `B → C` and translate it into a composition of `dA → dB` and `dB → dC`, at least without building a B instance statefully. So I think incremental computing will always have a rather ad-hoc nature.

However, it may be feasible to create some data types and compositions that simplify incremental computation, similar to how CRDTs simplify partitioning-tolerant computation.

There are already obvious

There are already obvious computations where precise deltas can be easily captured. Consider sum: if some element of the sum goes from N to N', the repair to the sum is to add (N' - N) to it. Easy. Now do MAX, not so easy anymore.

I would guess that the class of differentiable computations is similar, if not the same, as reversible ones, which isn't very general at all.

Extended update function

For max I guess you would need an update function of type (input: A) -> (inputDelta: dA) -> (oldOutput: B) -> (newOutput: B) (i.e. dB = B -> B). Would that work in practice?

Associative folds are "easy"

dB = B -> B makes some sense, but then it's hard to use dB to compute a further dC — since you can't pattern match on the function, you end up computing the "new B" and doing the whole computation on it.

Max is an associative fold, and there are general techniques for those. Hence, you can compute it using divide-and-conquer, and remember the tree of intermediate results (which will be balanced). Changing one element will now affect just the intermediate results depending on it (its "ancestors" in the tree), of which there's a logarithmic number — so you can process those updates in log. time.
Making updates efficient for insertion and deletion is trickier, and requires a randomized division process (see Umut Acar's PhD thesis, Sec. 9.1).

This is an instance of a general pattern: with self-adjusting computation, you just need to write your base computation so that its "computation tree" is balanced — then SAC's memoized change propagation will take care of updating the computation tree and the result when some input changes.

Something simpler

Actually I was thinking about something simpler. If the type of the update function is:

A -> dA -> B -> B

the max update function could be written for example as:

updateMax (input: Set Int, change: {added: Set Int, removed: Set Int}, oldResult: Int) = if change.removed.contains(oldResult) then max(input) else max(oldResult, max(change.added))

So if the previous max element has been removed a full re-calculation would be required, but otherwise a new max value can be calculated incrementally.

"Incremental DSLs"

To guarantee composition, you need to combine A × dA → B × dB and B × dB → C × dC. This means you might need an instance of B and C — but that was computed during the base computation, so you *just* need to remember it! (Ahem, this is the very basic idea).
Then, you need to avoid using the base inputs *so much* that your incremental computation is too slow.

Indeed, even with self-adjusting computation, incrementalizing a program takes some program-specific effort.

However, I think we want to have DSLs with primitives that are already well-incrementalized (and somewhat general), so that users can "simply" write their programs in terms of those primitives. Then the effort is to support more and more reasonable primitives, and allow for composition.

I'm at work on all of this, so hopefully you'll hear more soon.

Sounds interesting.

Please keep us posted.