Revisiting Coroutines

Revisiting Coroutines, by Ana Lucia de Moura and Roberto Ierusalimschy:

This paper defends the revival of coroutines as a general control abstraction. After proposing a new classification of coroutines, we introduce the concept of full asymmetric coroutines and provide a precise definition for it through an operational semantics. We then demonstrate that full coroutines have an expressive power equivalent to one-shot continuations and oneshot partial continuations. We also show that full asymmetric coroutines and one-shot partial continuations have many similarities, and therefore present comparable benefits. Nevertheless, coroutines are easier implemented and understood, specially in the realm of procedural languages. Finally, we provide a collection of programming examples that illustrate the use of full asymmetric coroutines to support direct and concise implementations of several useful control behaviors, including cooperative multitasking.

Coroutines seem to get fairly short riff in the literature, and they have only been discussed on LTU, a couple of times. Given coroutines have such a straightforward mapping to hardware, I hope they get more attention.

Coroutines show up in many different places. For instance, the inter-process communication (IPC) facilities of microkernels, like EROS, are a faithful implementation of asymmetric coroutines, with an important difference. Essentially, yield and resume must both take an explicit coroutine argument naming the coroutine respectively yield to and resume. If the coroutine to yield to is left implicit, as it is in most treatments I've seen, then coroutines become less composable since yield returns control to the innermost resume which, given abstract types, might be the wrong one.

This problem is discussed in Section 5.6, "Avoiding Interference Between Control Actions". The paper recommends tagging coroutines to match up resume/yield pairs, but the EROS IPC system provides a more direct encoding via a "resume" capability, which is a single-use coroutine used to return control directly to a client. Each subsequent invocation of the object synthesizes a new resume capability.

Taking this to the extreme implies that yield and resume can be unified into a single "invoke" operation which accepts a coroutine argument to be used in a subsequent invoke operation. Indeed, these are "symmetric coroutines". This paper suggests that symmetric coroutines are harder to understand due to the actors/CPS-like nature of the control flow.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Module signatures for coroutines

ML module signatures for symmetric and asymmetric coroutines. Deriving these helped me understand the discussions in the paper:

   (* asymmetric coroutines *)
   type ('a,'b) coro
   val create: ('a -> 'b) -> ('a,'b) coro
   val resume: ('a,'b) coro -> 'a -> 'b * ('c,'d) coro
   val yield: 'a -> 'b * ('c, 'd) coro

  (* symmetric coroutines *)
   type ('a,'b) coro
   val create: ('a -> 'b) -> ('a,'b) coro
   val invoke: ('a,'b) coro -> 'a -> 'b * ('c,'d) coro

You can see the problem with yield above, as it's blind to the nesting level of coroutines. Introducing a coroutine argument makes it identical to resume and invoke.

Since coroutines are single-use, I provided a purely-functional signature which returns new, typed coroutines on each return. With a linear type system, this constraint can be enforced.

Bad link

The paper sounds interesting, but your link doesn't lead anywhere.

Fixed, thanks. Anybody know

Fixed, thanks. Anybody know what's going with CiteSeer? This link to the paper, and other papers, haven't loaded in almost a week now. Usually downtimes are a day at most.

CiteseerX

I use the new site:

http://citeseerx.ist.psu.edu/

Symmetric coroutines

If yield() takes a coroutine argument, doesn't that make it equivalent to resume()? Which would imply the coroutine mechanism is symmetrical not asymmetrical.

Yes, I note this in the post

Yes, I note this in the post actually. In a follow-up comment, I also provided the function signatures I use to reason about these operations.

[Edit: I belatedly realized that perhaps you are referring to my labelling IPC as asymmetric. Is that the case? Classifying IPC isn't quite as simple as coroutine semantics would suggest, since there were concessions made due to hardware limitations, so whether IPC is based around symmetric or asymmetric coroutines could fall on either side really. However, the return operation (read: yield) is distinct from the call/send operation (resume), so I would classify it as asymmetric.

If instead, you are referring to my suggestion to add a parameter to yield, then consider that the parameter need not be reference to a coroutine, but merely to a "return" object (which encapsulates the coroutine). This is similar to the IPC mechanism.]

Hyperfunctions and coroutines

Reading through the paper, I was reminded of hyperfunctions, which are functions with the recursive type H a b = H b a -> b. They act like a (possibly infinite) sequence of functions and let you write code of the form f1 (g1 (f2 (g2 (f3 …)))), where the g's might come from the composition of arbitrarily many hyperfunctions.


newtype H a b = H { invoke :: H b a -> b }

base x = H $ const x
f << k = H $ \h -> f (invoke k h)

arr :: (a -> b) -> H a b
arr f = f << arr f

compose :: H a b -> H b c -> H a c
compose f g = H $ \k -> invoke g (compose k f)

(It turns out, people have been using hyperfunctions to model coroutines for some time now. Although judging by the Google results, there hasn't been much to report.)

Coroutines in C

I just finished a pseudo-portable symmetric coroutine library for C if anyone is interested. As suggested in this thread, there is a coro_poll function which must be called periodically to ensure the coroutine has adequate resources to continue. This polling function can shrink or grow the stack based on a hysteresis algorithm inspired by the description of one-shot continuations in Representing Control in the Presence of One-Shot Continuations.

Windows problems, stack copying

Windows has issues with stack switching which drove me to experiment with a stack copying approach. A very basic implementation inspired by sigfpe's hack works great.

We lose two orders of magnitude in performance, but use less memory, and all of C is supported (stack switching places restrictions on taking the addresses of stack variables since the stack might be relocated when grown/shrunk). With stack caching, we gain an order of magnitude in performance, so copying is only one order of magnitude slower than straight switching.

The next obvious optimization is to only lazily copy portions of the stack with an underflow handler. This is trivial to set up in assembler, but I can't think of a portable way to do it in C. Any ideas how to do this, or suggestions for further optimizations? :-)

Windows Fibers

I see that you're trying to use Standard C, but for Windows the supported way of switching stacks is to use fibers. Fibers are very similar to coroutines.

Yes, I'm aware of this,

Yes, I'm aware of this, however cloning a coroutine is an important operation (multishot continuations are not possible without it), and fibers do not have a copy or clone operation. That's why I'm trying to devise an alternative.