Coroutines as a Basis for UI Programming

I've written a short piece on my blog illustrating the use of coroutines to build up UIs compositionally.

I'd be curious if anyone here knows of prior research in this direction. I think there is some similarity between this approach and immediate-mode GUI, discussed previously here. I believe that immediate-mode GUI could be seen as a special case of coroutine UI.

Interesting!

This looks very promising. It reminds me both of immediate-mode GUI and of FRP. That IMGUI article describes a manual implementation of coroutines—they’re literally doing single-threaded cooperative multitasking—so actual coroutines would seem to be a much more flexible means of expressing essentially the same thing.

IMGUI

If you're comparing to IMGUI, the big motivation there is avoiding "retained mode" state. There are a number of reasons for this: persistence, runtime update, and simplified widget lifetime management (if you don't need a widget, you just don't call it that frame).

The reason IMGUI initially seems less compositional is the global state. But this is easily addressed by stable allocation of tree-structured state to different subprograms. A widget associated with a subtree doesn't need access to the full parent tree, just its little niche. So we can get the advantages of external state (orthogonal persistence, runtime update, extension, precise resets) in addition to the advantages of local state (modularity, composition), while protecting the advantage of IMGUI (simple lifespan management).

IMGUI can be implemented a number of ways so long as it avoids hidden (local, retained-mode) state. I favor IMGUI in a more reactive model where I get the advantages of caching and limiting how much I redraw. These coroutines seem to have hidden state and independent life-spans. I suppose you were relating these coroutines somehow to the specific implementation in the linked article, rather than IMGUI as a concept?

I was reacting to the linked

I was reacting to the linked implementation, yes - as I worked out the coroutine system I remembered that the IMGUI implementation had some significant structural similarities, e.g. the explicit management of "hot" state and focus.

My intuition is that, if one were to serialize/deserialize coroutine state - including sequencing state - into a tree based on the coroutine call graph, one would get something like the "tree-structured state" IMGUI you're talking about. E.g., if we had the following coroutine:

coroutine foo() { widget1(); widget2(); }

then it becomes the following IMGUI program:

function foo() {
  if(widget1Done) {
    return widget2();
  } else if(widget1()) {
    widget1Done = true;
  } 
}
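
For comparison, JavaScript generator functions can stand in for the `coroutine` keyword above. This is only a minimal sketch: the bodies of `widget1` and `widget2`, the `runFrames` driver, and passing widget1's result on to widget2 are all assumptions made for illustration, not part of the original system.

```javascript
// Hypothetical widgets: each yields a description of what to draw, once per
// frame, and returns when it is finished.
function* widget1() {
  yield "widget1: frame 1";
  yield "widget1: frame 2";
  return "w1-result";
}

function* widget2(input) {
  yield "widget2 showing " + input;
}

// The coroutine: run widget1 to completion, then widget2.
// There is no widget1Done flag; the generator's suspended position plays that role.
function* foo() {
  const result = yield* widget1();
  yield* widget2(result);
}

// Hypothetical driver: resume the coroutine once per frame and collect what it drew.
function runFrames(coroutine) {
  const frames = [];
  for (const drawn of coroutine) frames.push(drawn);
  return frames;
}

const frames = runFrames(foo());
console.log(frames);
```

The sequencing state that the IMGUI version keeps in a flag is held here by the suspended generator, which is why it cannot be inspected from outside.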

The point about immediate

The point about immediate-mode UI is that you don't need to translate to continuation-passing style, either by hand or automatically. In your examples, you are doing CPS-like stuff basically by hand. In IMGUI you would never have the widget1Done flag as state, but you probably can't encode long-running operations either. Glitch, which I should write more about soon, can achieve non-blocking concurrent execution in an immediate-mode UI without CPS by simply re-executing nodes transitively, on the fly, whenever the data values they depend on become known from long-running operations.

Tree structured state

An IMGUI program for a limited-duration widget1 might look like this:

function foo(s,arg1,arg2) {
  if(!s.w1) s.w1 = { done: false, result: 0};
  if(!s.w2) s.w2 = {};
  if(!s.w1.done) { return widget1(s.w1,arg1); }
  else { return widget2(s.w2, arg2, s.w1.result); }
}

Relevant points:

  • states are relative, not absolute; the widget doesn't know its location in the larger tree
  • 'foo()' becomes a reusable widget that can be applied many times within an application for different subtrees. Same with widget1() and widget2().
  • we can *restart* foo at any time, by simply clearing the state 's'.
  • the caller of 'foo' can potentially introspect 's'. This can be used for persistence, or to add observer patterns, or to extend behavior.
  • this can be made a lot less verbose by appropriate use of language
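
The points above can be sketched as follows. `foo` is the function from the listing; the bodies of `widget1` and `widget2`, the `appState` tree and the `frame` function are hypothetical and exist only to make the example runnable.

```javascript
// Hypothetical widget1: accumulates `step` for three frames, then marks itself done.
function widget1(s, step) {
  s.result += step;
  s.count = (s.count || 0) + 1;
  if (s.count >= 3) s.done = true;
  return "widget1 at " + s.result;
}

// Hypothetical widget2: displays widget1's result.
function widget2(s, label, w1Result) {
  return label + ": " + w1Result;
}

// foo, as in the listing above.
function foo(s, arg1, arg2) {
  if (!s.w1) s.w1 = { done: false, result: 0 };
  if (!s.w2) s.w2 = {};
  if (!s.w1.done) { return widget1(s.w1, arg1); }
  else { return widget2(s.w2, arg2, s.w1.result); }
}

// The application owns the whole tree; each use of foo sees only its own subtree.
const appState = { left: {}, right: {} };

function frame() {
  return [
    foo(appState.left, 1, "left total"),
    foo(appState.right, 10, "right total"),
  ];
}

frame(); frame(); frame();     // widget1 runs for three frames in each subtree
const after = frame();         // both instances have moved on to widget2

appState.left = {};            // restart only the left instance
const restarted = frame();

// The caller can introspect the state, for example to persist it.
const snapshot = JSON.stringify(appState);
console.log(after, restarted, snapshot);
```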

Besides state, one can include arguments like mouse position relative to the widget, etc.

There are also useful "pure functional" variations on this where we don't allow the state to change until the next frame. It's easy to model, and can lead to more robust applications and parallel computations.
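
A minimal sketch of that variation, assuming a hypothetical `counter` widget and `frame` function: each widget reads the previous frame's state and returns its next state, and the new tree replaces the old one only after every widget has run.

```javascript
// Hypothetical pure widget: reads last frame's state, returns the next state
// and a view, and never mutates its input.
function counter(prev, clicked) {
  const count = prev.count + (clicked ? 1 : 0);
  return { state: { count: count }, view: "count = " + prev.count };
}

// Hypothetical frame function: all widgets see the same previous tree; the
// updates become visible only in the tree returned for the next frame.
function frame(prevTree, input) {
  const a = counter(prevTree.a, input.clickA);
  const b = counter(prevTree.b, input.clickB);
  return {
    tree: { a: a.state, b: b.state },
    views: [a.view, b.view],
  };
}

const initial = { a: { count: 0 }, b: { count: 0 } };
const first = frame(initial, { clickA: true, clickB: false });
const second = frame(first.tree, { clickA: false, clickB: false });
console.log(first.views, second.views);
```

Because no widget can observe another widget's update within the same frame, the two calls to `counter` could run in either order, or in parallel, with the same result.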

It isn't clear to me what you were imagining with tree-structured state. But a global 'widget1done' flag certainly isn't right.

Coroutine Translation

The translation of this widget into coroutines with explicit continuations would look like this, I think:

function foo(arg1, arg2) {
  var w1 = widget1(arg1);
  if(w1.type == 'finish') {
    return afterW1(w1.result);
  } else {
    return {type: 'yield', value: w1.value, continuation: function run1(input) {
      w1 = w1.continuation(input);
      if(w1.type == 'finish') { 
        return afterW1(w1.result); 
      } else {
        return {type: 'yield', value: w1.value, continuation: run1};
      }
    }};
  }
  function afterW1(result) {
    return widget2(arg2, result);
  }
}

I think that the two are very similar. The main difference is that the current state of each widget is stored in a closure rather than an explicit tree structure. The fact that widget1 is done is stored by returning a continuation which jumps past widget1, rather than an explicit done flag. But the control flow is almost the same, at least for this example.
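
To make the yield/finish protocol concrete, here is a minimal sketch of a driver for it. `foo` is copied from the listing above; the bodies of `widget1` and `widget2` and the `run` driver are hypothetical and only illustrate how such a coroutine might be stepped.

```javascript
// Hypothetical widget1: yields a prompt and finishes once it receives "ok".
function widget1(arg1) {
  return {
    type: 'yield',
    value: 'widget1 waiting on ' + arg1,
    continuation: function resume(input) {
      if (input === 'ok') return { type: 'finish', result: arg1.length };
      return { type: 'yield', value: 'widget1 still waiting', continuation: resume };
    },
  };
}

// Hypothetical widget2: finishes immediately.
function widget2(arg2, result) {
  return { type: 'finish', result: arg2 + result };
}

// foo, as in the listing above.
function foo(arg1, arg2) {
  var w1 = widget1(arg1);
  if (w1.type == 'finish') {
    return afterW1(w1.result);
  } else {
    return {type: 'yield', value: w1.value, continuation: function run1(input) {
      w1 = w1.continuation(input);
      if (w1.type == 'finish') {
        return afterW1(w1.result);
      } else {
        return {type: 'yield', value: w1.value, continuation: run1};
      }
    }};
  }
  function afterW1(result) {
    return widget2(arg2, result);
  }
}

// Hypothetical driver: feed one input per step until the coroutine finishes.
function run(step, inputs) {
  const shown = [];
  for (const input of inputs) {
    if (step.type === 'finish') break;
    shown.push(step.value);
    step = step.continuation(input);
  }
  return { shown: shown, step: step };
}

const outcome = run(foo('name', 'length: '), ['no', 'ok']);
console.log(outcome.shown, outcome.step);
```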

Of your five points, 1, 2, and 5 are true of coroutines as well. 3 is true, assuming that we can store continuations from a partially-run coroutine. 4 is generally not true of the coroutine approach, because every closure implementation I know of is opaque.

re: Coroutine translation

One difference you missed: in the IMGUI widget, `arg1` and `arg2` could change every frame, enabling them to behave as time-varying signals. This puts a relatively reactive UI framework on a path-of-lesser-resistance.
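
As a small illustration of arguments acting as signals, assume a hypothetical `label` widget that is simply called again each frame with whatever the current value is. The widget, the state fields and the sample values are all made up for this sketch.

```javascript
// Hypothetical label widget: draws the current text and uses its state only
// to notice when the incoming value has changed since the previous frame.
function label(s, text) {
  if (s.last !== text) {
    s.changes = (s.changes || 0) + 1;
    s.last = text;
  }
  return "[" + text + "]";
}

const labelState = {};
const samples = [20.5, 20.5, 21.0];   // hypothetical per-frame values of a signal
const drawn = samples.map(function (t) {
  return label(labelState, t.toFixed(1) + " C");
});
console.log(drawn, labelState.changes);
```

The change check is the kind of hook a reactive framework could use to skip redrawing when the signal has not moved.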

Anyhow, I don't disagree that coroutines can be used to obtain similar features. I was mostly concerned that your IMGUI example wasn't demonstrating some of the reusability properties and relative advantages of which it is capable.

Without tree-structured data (or a suitable alternative) immediate mode would be unable to achieve points 1 and 2, and would be very ad-hoc for point 3. With tree-structured data, immediate mode can be more implicitly extensible and controllable than retained-mode (due to points 3,4). And with reactive programming, it can also have the same incremental computation advantages as retained mode.

Process Algebra based GUI specification

Related might be the Process Algebra (ACP) based GUI specification as done in my Scala extension named SubScript; see here. That contains an example with a separation between the place where widgets are constructed and added to the window, and the place with scripts for the widget behaviors. I find your approach interesting because it takes only one place and it allows for different kinds of compositions.

Maybe something slightly similar would be possible with SubScript. The scripts from the example describe how a typical search with a text field and a button is performed, starting with a click event on the button:

searchSequence = searchCommand  showSearchingText searchInDatabase showSearchResults
searchCommand  = clicked(searchButton)

When searchCommand, and with it the clicked script, is activated, the latter enables the button using "activation code". When the user clicks, the script deactivates, and the button gets disabled using "deactivation code".
In principle, the activation and deactivation code can also take care of creation, placement, removal and destruction of the button. Then there would be only one place for the initialization, finalization and behavior of the buttons, but this would lead to a restless look and feel in this example case.

However, it might be possible (still TBD: get it working; not all features have been implemented yet) to write for example a simple temperature converter like:

converterApp = numericInput &==> fahrenheitToCelcius &==> numericOutput

&==> denotes parallel composition (&) with possible network communication over channels from left to right; it is much like a pipe in Unix shell language.

The conversion is done in:

fahrenheitToCelcius = =>f?:Double  val c = (f-32)*5/9 <=c ...

This describes a sequential loop (...) of receiving a number f from an input channel; then converting it to a number c and then sending c over the output channel.
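
For readers who do not know SubScript, the data flow can be loosely imitated with JavaScript generators. This is only an analogy under stated assumptions: the three generator functions are hypothetical stand-ins named after the scripts, and they are pulled sequentially rather than running in parallel over channels as `&==>` would.

```javascript
// Hypothetical stand-in for the numericInput script: produces the entered values.
function* numericInput(values) {
  for (const v of values) yield v;
}

// Loop: receive a number f, convert it to c, pass c on.
function* fahrenheitToCelcius(input) {
  for (const f of input) {
    const c = (f - 32) * 5 / 9;
    yield c;
  }
}

// Hypothetical stand-in for numericOutput: turns each value into label text.
function* numericOutput(input) {
  for (const d of input) yield d.toString();
}

// Left-to-right composition, in the spirit of a Unix pipe.
const labels = Array.from(
  numericOutput(fahrenheitToCelcius(numericInput([32, 212, -40])))
);
console.log(labels);
```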

Two other scripts would take care of creating the widgets and adding these to the application window and finally removing these.
Script numericInput has a main loop that (each time) calls the script valueChanged. This accepts an event from a numeric text field and yields a numeric result, which is forwarded (==>) to an action that sends it (_) over an output channel.
Script numericOutput has a main loop that receives a value from an input channel and then sets the label text accordingly.

numericInput = var nf = new TextField
                         @{window.add(nf); there.onDeactivate{window.remove(nf)}}:
                         ( valueChanged(nf) ==> <=(_); ... )

numericOutput = var nl = new Label {preferredSize = new Dimension(65,26)}
                         @{window.add(nl); there.onDeactivate{window.remove(nl)}}:
                         ( =>d?:Double {nl.text = d.toString} ... )

These two scripts could be imported from some library trait, that would need proper handles for adding and removing the widgets, of course. The trait would also provide a definition for the channel:

<==>(d:Double) = {}

I am not sure whether this way of programming would be really handy for GUIs, but it is different and could be interesting.
The data flow using the arrow symbols is explained in my paper for the Scala Workshop, last July.