Adding Delimited and Composable Control to a Production Programming Environment

Adding Delimited and Composable Control to a Production Programming Environment (add'l material), Matthew Flatt, Gang Yu, Robert Bruce Findler, Matthias Felleisen, ICFP 2007.

Operators for delimiting control and for capturing composable continuations litter the landscape of theoretical programming language research. Numerous papers explain their advantages, how the operators explain each other (or don’t), and other aspects of the operators’ existence. Production programming languages, however, do not support these operators, partly because their relationship to existing and demonstrably useful constructs—such as exceptions and dynamic binding—remains relatively unexplored. In this paper, we report on our effort of translating the theory of delimited and composable control into a viable implementation for a production system. The report shows how this effort involved a substantial design element, including work with a formal model, as well as significant practical exploration and engineering. The resulting version of PLT Scheme incorporates the expressive combination of delimited and composable control alongside dynamic-wind, dynamic binding, and exception handling. None of the additional operators subvert the intended benefits of existing control operators, so that programmers can freely mix and match control operators.

Another tour de force by the PLT folks. Does your language have delimited control, delimited dynamic binding, and exceptions? It's the new gold standard, and so far only Racket and O'Caml qualify (and maybe Haskell and Scala?)

Racket's implementation is additionally interesting because it achieves backwards compatibility with code written using undelimited call/cc and dynamic-wind. The authors mention that a simpler solution would be possible without this compatibility - based on control filters from the Subcontinuations paper.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Common Lisp

You can implement these concepts as libraries for Common Lisp: cl-cont provides delimited continuations, and ContextL provides delimited dynamic binding (among other things). The two libraries are compatible.

another data point, and reflections

Guile implemented delimited control that composes with delimited binding and exception handling, largely inspired by this paper, with the same considerations regarding dynamic-wind -- though Guile does not yet delimit call/cc.

But that question detracts from the loveliness of this paper. I actually liked its use of diagrams. It made the semantics more approachable to me.

Given the age of this paper, I would like to see more experience reports around delimited continuations -- for example, it seems that part of the PLT folks use serializable continuations, which do not use this mechanism. Are delimited continuations really good for web servers, for example, or do they promote an architecture that does not scale well? Do people tend to invoke continuations more than once, or just once? Has anyone implemented user-space threading with their delimited continuations? One would like to avoid invoking winders in that context.

As far as I know, delimited

As far as I know, (delimited) continuations are a failed experiment for web programming. They only help for applications that follow a pre-set control flow like the wizards you find in some Windows applications and check-out wizards for e-commerce, which is uncommon (and bad UI design).

Explicit continuation passing to multiple callbacks on a page could still work, though.

I'm not even aware that

I'm not even aware that delimited continuations have been tried in web scenarios, other than Oleg's proof of concept.

You're not tied to a static control flow either, which seems to be what you're implying.

Well, you're probably right

Well, you're probably right that they haven't been tried in practice. A quick search turned up this, and there is a lot of stuff about using undelimited continuations.

The problem is not that you're tied to a static control flow, the problem is that all of this rests on the premise that you want to write web applications like you write console applications. You ask for user input on pageview 1, then you ask for user input again on pageview 2, then you do some output on pageview 3, etc. This linear control flow is not how web applications work. It sure is possible to build websites this way by branching on the answer you get from each page, but what's the advantage? You can see this in Oleg's work, e.g. here in the blog example.

Instead of writing code in this call-answer style, it is much more natural to embrace the web instead of trying to keep programming like it's a 1970 terminal, for example by explicitly attaching a callback continuation to each form/link.

So you're arguing that we

So you're arguing that we should expose the continuations, and program in explicit CPS form, rather than hide the continuations. I agree that's probably the best approach for now, because there are all sorts of persistence and failure issues that you have to handle at exactly the point of creating/invoking continuations.

I wouldn't rule out the possibility that someone could devise a good runtime to handle these issues automatically though. I just don't think we really know how yet. Formlets is on the right track though.

It's not a runtime problem,

It's not a runtime problem, it's an expressiveness problem. Fundamentally a page does not have a single continuation. If you click on link X then you invoke a different continuation than if you click on link Y or submit form Z. So instead of wrapping all of these up into a single continuation that then dispatches on whether it was X, Y or Z that was clicked/submitted, it's better to just attach a different continuation to X, Y and Z. That also means that you don't need all the delimited/undelimited continuation stuff, you just need closures.

But why would the use of

But why would the use of direct-style continuation operators handle that differently from the explicit CPS style? I don't think they require you to use "a single continuation" and do the dispatch as you describe. The web server would push a prompt at the beginning of the user interaction with the server and, when an action requiring delaying/saving future interaction is encountered (eg. clicking on a link/form/whatever), save the current continuation to be ready to restore it after the delay or on backtracking.

If different such interactions happen in different possible executions, that correspond to different continuations being saved -- each corresponding to the local context of the interaction -- just like you pass different closures to such interaction points.

I don't think I understand

I don't think I understand what you're saying, or at least I don't understand how one would build a web application that way. To me it seems like one wants to suspend execution at the point where you send a page to the client, and then resume execution differently based on which link was clicked or which form was sent, thus either requiring explicit dispatch or dispatch based on having a different continuation for each link/form. Oleg's work does the former, I think the latter is better. The latter does not really benefit from direct style continuations machinery except in some special cases, since such machinery gives you just a single continuation. Can you give a simple example of what a web app would look like with a library such as you have in mind?

send/suspend/dispatch

I think what you're looking for is provided by the send/suspend/dispatch form in Racket's continuation-based web server. There's a paper about it here: http://repository.readscheme.org/ftp/papers/sw2003/WebUI.pdf

Yes, this paper does address

Yes, this paper does address exactly the problem that I have in mind. I do not think that's a good solution however, as it is unnecessarily complicated. You don't need delimited continuations or the send/suspend/dispatch form at all, you should instead just let embed/url be a global procedure that registers a closure in the dispatch table directly.

dispatch = {}

def embedUrl(f):
  id = generateRandomKey()
  dispatch[id] = f
  return "/my/site/" + id

def handleRequest(request):
   id = extractID(request)
   dispatch[id]()

This is all you need. See my Ruby toy implementation below. It also has some other nice stuff like the ability to read form inputs through lexical scoping without messing around with request parameters indexed by string keys manually.

Programming in CPS

Now you have to write all your handlers as top-level functions. In other words, you're programming in CPS. That's what the continuations avoid.

That is a good thing. You're

That is a good thing. You're programming in something more general than CPS, because a page has N continuations rather than 1. Which is exactly what you need for web sites. Web sites simply do not follow the stack discipline with 1 distinguished return point per page that continuations provide, except in very specialized cases (e.g. a multi-page e-commerce checkout process). And if your language supports delimited continuations you can still make use of them for those specialized cases*. But using continuations for everything and introducing a special construct for dispatching is not a good default. If you know of real world cases where delimited continuations help I'd be glad to hear them.

* For example for a checkout process you write the process like this:

def checkout():
  reset {
    data1 = page1()
    data2 = page2()
    data3 = page3()
    completeCheckout(data1,data2,data3)
  }

Then you let each pageN function capture the delimited continuation and have the form on the page push its data to the delimited continuation:

def page1():
  shift(k => 
    ...
    <form action={embedUrl(k)}>...</form>
    ...)

[I'd say this is bad UI design though; don't split the form across multiple pages, and even if you do you'll probably want the ability to go back to a previous page, and at this point the CPS form wins again]

[I'd say this is bad UI

[I'd say this is bad UI design though; don't split the form across multiple pages, and even if you do you'll probably want the ability to go back to a previous page, and at this point the CPS form wins again]

Two things here.

First, continuations should interleavable within components. E.g., I'd like a page that allows interleaving two interactions -- say a checkout sequence in one component and some sort of settings configuration in another. Seaside had some support there; I never quite wrapped my head around how to do it in other frameworks.

Second, the same linguistic abstraction should compile fine (in theory) to AJAX applications. Whether URLs or AJAX is used to handle the back button doesn't matter. I actually view this as a win for AJAX programming: right now, it's a pain to manually build in an AJAX back button. I agree that the non-AJAX case is increasingly an anachronism, but, again, that's arguably more of a compiler target question rather than a frontend linguistic one :)

I looked at Seaside and it

I looked at Seaside and it actually does a hybrid thing: attaching a callback to every link instead of 1 continuation per page, but at the same time supporting wizard/checkout processes (which I assume is implemented with call/cc the equivalent Smalltalk facility). For example look at the counter example. The HTML outputting is a little different though; instead off returning the HTML from the continuation in a functional style there seems to be a global HTML output handler that depends on global mutable state.

Agreed that the same abstraction should be able to compile down to both ordinary HTML links that go to an entirely different page and an AJAX callback that updates just the right parts of the page. One reason why the non-AJAX case is still very important is search engines. Another reason is that HTML5 pushState is not yet supported in Internet Explorer (it will be in version 10 though).

Agreed that the same

Agreed that the same abstraction should be able to compile down to both ordinary HTML links that go to an entirely different page and an AJAX callback that updates just the right parts of the page.

I've been thinking about this for -- oh -- 7 years now :) Language-level solutions here mean that the language can take care of exposing the hooks to the search engine. E.g., emit both the AJAX-y and the flat, and redirect / jump the deep link into the appropriate AJAX step. I don't think that's the right way to go about any of this, but it makes sense if you believe continuations / pages / etc. are the right model.

Interesting. Why do you

Interesting, I am also working on this (though it's just a hobby project). Why do you think that's not the right way to go about this? And what do you suggest as an alternative?

The higher-level issue is

The higher-level issue is that, when we're developing applications, we should be building them in a way that facilitates live meta-programming over them. The web is an amazing success here -- search engines, mashups, end-user programming (see Rob Miller's group @ MIT) -- while traditional application models (Haskell, C#, iPhone/Android SDKs) are 1-2 decades behind.

As an example, I made a browser extension a few years ago that lets you run a sloppy queries over a web application such as "change security permissions of grand canyon photos". It would build an interaction model of a web application as you surf (e.g., Flickr), and, from that inferred model, try to figure out how to map your command as a path through it. However, this is a very uncooperative way to build such applications -- part of the programming model should be to expose the information relevant for such tools.

If you look at the best papers from CHI/UIST over the past 5-10 years, they've been about *HCI* researchers saying that's how we want to use software. For them, SIRI etc. is no surprise. The DB and semantic web communities have been doing a bit in this area, but I haven't seen solid PL/SE research showing how to do it from first principles: continuations may be a part, but there's a lot more to do.

Just wanted to add: this is

Just wanted to add: this is exactly Don Sym is working on in F# with type providers and information rich programming.

You can do many of the same

You can do many of the same things to a lesser degree with existing desktop applications. Windows has an API for inspecting the widget tree. This is used for example by screen readers that turn a visual UI in a spoken language UI.

But indeed what people doing web applications have increasingly noted is that the UI presented to a human is not friendly to a computer. These days custom UI for computers aka APIs or web services are common. The way this is done in e.g Rails is that you use the same data model but different "UI" logic for the API. The same mechanism that is used to output different stuff for an API client and for a web browser, is also used to output different stuff for an AJAX request. A common way is to output a piece of Javascript code that will update the client side page to a new state. Perhaps all three use cases can be handled automatically by a single piece of code in a declarative way. The HTML output case and the AJAX case can definitely be handled by the same piece of code because the intended effect of the two is the same (though presumably faster with AJAX and retaining client UI state like scroll position and cursor position). But the point of the API is that is does something different than render a UI for a human. You can only generate the API automatically if it follows the same structure as the UI presented to humans (which may or may not be a good thing).

An alternative architecture is to build your client side UI as a Javascript application that talks to your API just like any other user of your API would. The problem with this is that it is not accessible to search engines (although it could be made that way by simulating the Javascript client on the server). I'm not at all sure what the best answer is here, or if it is even in this list.

Your scenario of N

Your scenario of N continuations hidden behind 1 front-end continuation sounds like it's assuming a particular model of continuation-based web programming. Explicit and implicit continuations are inter-convertible though. If you claim the problems you perceive are solved by an explicit CPS model, then take that model and reverse the CPS transform and you will have a direct-style continuation-based web library that achieves the same properties.

The only differences that remain are the greater difficulties of reasoning about continuation storage and lifetime in direct-style, ie. continuations tend to be opaque/abstract, so reflecting over them is often impossible. An explicit continuation model exposes closures and this is where we can parameterize the closure parameters by the functionality we need, ie. serializers, etc.

These are merely runtime issues though, like I said. If you could reflect on the structure of a captured continuation, you could do much of the same in a direct-style model.

Not all callback style

Not all callback style programs can be reverse CPS transformed. Very few in fact. What would you do with a term that has multiple continuation subterms? That is exactly what happens when you construct a representation for a page, since every link and form will have its own continuation.

This is the difference between the web and a terminal. On the web the control flow is best modeled as a tree, on the terminal it's best modeled as a straight line. The continuation machinery (like call/cc, shift, reset, etc.) works well for the straight line case as they are dealing with a single continuation. On the web they don't really help unless your control flow happens to be in a straight line like a checkout wizard. Then you can treat the continuation that goes to the next page implicitly via the machinery, perhaps making the code a little bit nicer. IMO this doesn't justify the costs associated with the machinery.

The single continuation is

The single continuation is merely a "wrapper" for the tree of continuations it encapsulates. The tree is still there, you just manipulate it via a uniform single parameter interface. This is little different to how single-parameter functions in OCaml and Haskell still support multi-parameter functions. The runtime manages the invocation of saturated multi-parameter function calls for you.

What would you do with a term that has multiple continuation subterms?

It's a continuation that encapsulates sub continuations. This is indeed messy for undelimited continuations, but delimited continuations make this very straightforward.

Re: very few callback style programs being unable to reverse CPS transform, I doubt very much that's the case for delimited continuations here. Keep in mind we're not dealing with arbitrary callback style programs, we're dealing with ones closely modelling an explicit continuation passing form. I don't see why this couldn't be transformed into a direct-style program using delimited control.

Can you give an example of

Can you give an example of how you'd program against the web programming library you have in mind?

Here is an example where I try to illustrate the difference between the two ways.

1. Single continuation that dispatches:

This is the style that Oleg uses.

We have a function page body cont that takes the body of a page and a continuation to be invoked when clicking any of its links or submitting any of its forms. Suppose for simplicity that a page is a list of links. A link is constructed with the link text id function. When clicked it will pass id to the continuation.

Now we can do this:

page [link "click me" "link1"; link "or click me" "link2"]
     (fun linkClicked -> match linkClicked with
                         | "link1" -> pageA
                         | "link2" -> pageB)

We can program this in direct style as follows:

let linkClicked = page [link "click me" "link1"; "or click me" "link2"]
match linkClicked with
| "link1" -> pageA
| "link2" -> pageB

This requires continuations machinery.

2. Dispatch by having multiple continuations:

We have a function page body that this time only takes a body argument. We have a function link text cont. When clicked it will use cont to display the linked page.

The same example with this style goes like this:

page [link "click me" (fun () -> pageA);
      link "or click me" (fun () -> pageB)]

I think this latter style is better suited to web programming, and as far as I can see you can't reverse-CPS this (as it is not in CPS). This has a less imperative feel, as you define the whole web site as a big (possibly infinite) lazy data structure. For forms the thunks/continuations do take arguments (namely the values in of the form fields).

----

Here is a simple proof of concept library + examples I coded up in Ruby some time ago. Don't expect anything, the library is a whopping 36 lines of code (and since it stores the closures in a hash table on the server it doesn't scale at all). The result is the ability to write some nice and short programs that would be much longer with the traditional web programming approach. For example this counter which displays the current count and links to increase/decrease it:

def counter(i)
  puts "The counter is #{i}"
  link("increase") { counter(i+1) }
  link("decrease") { counter(i-1) }
end

Started up with a call to counter:

counter(0)

And forms:

form do
  puts "Enter two numbers:"
  a = input()
  b = input()
  submit do
    puts "The sum is #{a.to_i + b.to_i}"
  end
end

This displays a form with two input fields and a submit button. Enter two numbers and click on the submit button to bring you to a page displaying the sum of those numbers.

No callcc's were (ab)used in this library, just closures and some nasty code to keep it short.

I think this latter style is

I think this latter style is better suited to web programming, and as far as I can see you can't reverse-CPS this (as it is not in CPS)

Your link example is using implicit continuations because you can capture local variables in the closure which your runtime must serialize and restore. A true explicit continuation approach would look more like this:

page [link "click me" (onclick karg:() k:(fun () -> pageA));
      link "or click me" (onclick karg:() k:(fun () -> pageB))]

Where onclick is as follows:

val onclick: karg:'a -> k:('a -> page) -> cont

You can see how you have to explicitly list the parameters to the embedded closure which all gets wrapped up in cont. The closure must not capture an environment. This is exactly the approach I'm taking in my UI library since C# doesn't support continuations.

In direct-style using delimited control, you don't need onclick at all because link can implicitly extract the captured variables up to the enclosing page call. What you're objecting to is Oleg's particular use of continuations, because instead of exposing them as values in the UI, he's encapsulating them. But like I said, exposing them as UI values requires runtime capabilties which aren't too common, so explicit continuation form is the best way to achieve this for now.

Can you explain why

Can you explain why delimited continuation support helps here? You'll run into the same problem if your delimited continuations are not serializable. Furthermore, if one of the local variables of the captured continuation is itself a closure, you run into the problem of serializing closures again.

I think the issue here is serialization of closures, which as far as I can tell has nothing to do with continuations (delimited or otherwise). Indeed you can solve this by manually doing closure conversion as you suggest. [Ruby has native support for serialization of closures, although in the toy implementation I'm not using that because I keep the closures in memory]

I think there are two

I think there are two separate concerns that are being conflated here: business logic and presentation. Your link example addresses the latter, but not the former. I agree with you that representing UI control flow in this manner is ideal, but native continuations don't really show up in this navigable projection of the program.

The business logic concern is where native continuations show up. Consider a checkout process on Amazon. You load a series of N pages each of which gathers a fragment of validated data, and the final page contains the order submission. If represented using your approach, this checkout process is in CPS form, and the business logic for this process is scattered across the N pages.

Ideally, the checkout process logic should be in direct-style, regardless of how the UI may be presented. Any sequence of 1 to N pages should be usable for the same business logic process, as long as they bind the same M variables. This can be achieved via continuations, though Oleg doesn't take that extra step of fully decoupling the UI.

I don't see a simple way of simulating this without using CPS form, or something like continuations. For instance, there's a C# web framework that uses enumerators for this (can't find the link now, sorry).

Yes, I agree that

Yes, I agree that continuations support can potentially be useful for certain kinds of web page control flow. In fact that's how this thread started ;)

[Continuations] only help for applications that follow a pre-set control flow like the wizards you find in some Windows applications and check-out wizards for e-commerce, which is uncommon (and bad UI design).

So I think it's not worth the trouble. And even if you do use continuation support then the framework should be built around one closure per link/form, and not around a single continuation where you then have to manually dispatch based on the link that was clicked (that is just building something simple out of something complex instead of doing the simple thing in the first place).

So I think it's not worth

So I think it's not worth the trouble. [because it's uncommon and/or bad UI design]

Then perhaps I don't understand your proposal. Assume we have a process. This process involves accumulating data up to a point, after which the process results are committed. The user can backtrack this process or abandon it at any time, up until commit.

Continuations are the only solution I'm aware of to achieve these properties. You either write this process in manual CPS form, or you write it as a direct-style function using native continuations.

You imply some aspect of this scenario is uncommon or bad design, but the above description doesn't seem uncommon to me, and the bad design isn't clear. Can you elaborate?

Perhaps you can give a real

Perhaps you can give a real world example of such a process where code using direct-style using native continuations looks better? Even the seemingly perfect example of the Amazon checkout process doesn't work well anymore once you realize that you also want the ability to go back to a previous step. I think having a multi page checkout process is bad design in the first place (just put the whole form on one page), and very bad design if you don't have a way to go back to the previous page.

Have you looked at the papers on Scheme on using delimited continuations or the work of Oleg? (here) They make direct style the default, which means that you have to manually dispatch on which link was clicked after every page, and this makes your code look like this code that I posted before:

let linkClicked = page [link "click me" "link1"; "or click me" "link2"]
match linkClicked with
| "link1" -> pageA
| "link2" -> pageB

I'm just saying this is a bad default, and that having a different continuation for each link is better (and simpler). As you can see here it's still possible to use delimited continuations when that makes your code look nicer even if you let the explicitly CPS'ed version be the default.

Even the seemingly perfect

Even the seemingly perfect example of the Amazon checkout process doesn't work well anymore once you realize that you also want the ability to go back to a previous step.

Allowing backtracking is the whole reason web programs use continuations. I'm not sure why you think continuations wouldn't allow it. Oleg's presentation that you linked earlier talks about it. Did you mean something else?

They make direct style the default, which means that you have to manually dispatch on which link was clicked after every page, and this makes your code look like this code that I posted before:

Again, this isn't a consequence of direct style, this is a consequence of their UI language. Their UI language doesn't include continuations as a first-class value, so they simulate it. The 'dispatch' from send/suspend/dispatch one way to add continuations to the UI language.

As you can see here it's still possible to use delimited continuations when that makes your code look nicer even if you let the explicitly CPS'ed version be the default.

That's almost exactly what I've been suggesting, except checkout() wouldn't have any notion of "pages". You would just have some UI.Request(), and UI.Request() and some dependency injection would resolve the appropriate page for you.

Oleg's continuation-based blog code would look almost exactly as it does now, but because the UI language now supports continuations, it would look like what you're trying to achieve as well.

Allowing backtracking is the

Allowing backtracking is the whole reason web programs use continuations.

It's not really backtracking since you want to preserve the data entered on page N even if you go back to page N-1. I'm not saying that it can't be done, just that at that point the continuation support doesn't win over explicitly passing two different continuations to the next & previous links.

Again, this isn't a consequence of direct style, this is a consequence of their UI language. Their UI language doesn't include continuations as a first-class value, so they simulate it. The 'dispatch' from send/suspend/dispatch one way to add continuations to the UI language.

I don't understand what you mean here. The whole point of direct style is that you can write code like this:

let answer = sendPageFooToClient()
... do something with answer ...

But this assumes that after visiting PageFoo there is just a single continuation, which for most pages is not the case. Hence you need to dispatch based on which link was clicked and do the appropriate action for each.

That's almost exactly what I've been suggesting, except checkout() wouldn't have any notion of "pages". You would just have some UI.Request(), and UI.Request() and some dependency injection would resolve the appropriate page for you.
Oleg's continuation-based blog code would look almost exactly as it does now, but because the UI language now supports continuations, it would look like what you're trying to achieve as well.

I realize that I don't understand what you have in mind at all. Can you give a complete code example of what you have in mind? How would you write the checkout process? Or do you just mean that you don't need to put each step in its own function? (which I of course agree with; the same applies with my example: you can inline page1-3 into the checkout() function -- you can also rename shift to UI.Request if you wanted to).

Just realized I still had

Just realized I still had this comment in a background tab. ;-)

It's not really backtracking since you want to preserve the data entered on page N even if you go back to page N-1.

It is backtracking, but the value you want preserve is a reference, eg. like an ML 'a ref. Backtracking does not restore ref values.

Re:code sample: the checkout process is simply decoupled from the UI to which it's bound. For instance, checkout() issues various commands which are routed to UIs that satisfy those commands. Since I have C# on the brain, something like:

public IEnumerable<IContinuation> Checkout(ShoppingCart cart)
{
  UI<Address> addr = UI.Resolve<Address>(cart.Account);
  yield return addr;
  // addr is now fully populated
  UI<Address> payment = UI.Resolve<Payment>(cart.Account, addr.Value);
  yield return payment;
  // payment is now fully populated
  payment.Bill(cart);
  cart.ShipTo(addr);
  yield return UI.ContinueWith(new ThankYouForShopping());
}

The UI framework is in control, and only calls this function when an action registered to this continuation is invoked. So you get your model of multiple continuations per page, but you still have Oleg's model of single control-flow via delimited control.

To model this using only continuations, each "yield" above would have to be a new continuation object accepting parameters for the previous state. Not a huge deal, but you can imagine that language support can make this quite nice.


Web applications

The reason why continuations were considered in web applications is because of the curiosity of the "back" button, which doesn't make that much sense in applications in the first place. Applications should be based on what is easy to use, not what curiosities are part of an only partially fitting deployment platform. MVC is still one of the best ways to build GUIs, the web hasn't changed that...

OT i wish i understood the debate there better

"The first rule of partitioning control is you do not make controller objects. The second rule of partitioning control is you do not make controller objects." vs. MVC and more recently DCI.