Understanding continuations

In my neverending quest to grasp continuations (stupidly, apart from actually trying to use them in a supporting language), I found this "continuation sandwich" example interesting.

I wrote down my thoughts on what I still don't understand. If anyone would be willing to comment there or here and help me to correct my still-confused understanding of continuations I'd be very grateful.

No, no, no! ;)

My understanding of continuations stands as follows: With closures, you create a function that wraps any lexicals in scope at the time which then travel along with the function wherever it goes. A continuation is similar, except rather than operating within the scope of a function, it operates on the scope of the entire program, wrapping all program state and carrying it around with it.

A continuation *is* a closure - there's no difference, conceptually (there may be implementation-level differences, which essentially reflect that a continuation is a closure which has been created in a different way than most closures).

The things that tend to mess people up when thinking about continuations are (a) not fully understanding closures in the first place; (b) the behavior of what Scheme calls call/cc, which wraps up the "current continuation" as a closure; and (c) the idea that functions don't have to return to the place that they're called from, which is fundamental to the notion of continuation.

So, to understand continuations, assuming you really understand closures, the next thing you have to do is abandon the idea of functions that automatically return to their caller, because that's a mental barrier which will trip you up repeatedly. Instead, think in terms of continuation-passing style, in which every function is called with an "extra" parameter, its continuation, and it calls that continuation when it's done. Remember, that continuation is just a closure, i.e. a procedure - there really isn't anything special about it that matters. When a function is done, it calls its continuation, which is just another function.
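
For concreteness, here's a minimal sketch of that style in Scheme (the /k suffix is just an illustrative naming convention, nothing standard):

;; Direct style: the return to the caller is implicit.
(define (square x) (* x x))

;; Continuation-passing style: the "return" is an explicit call to k,
;; which is just an ordinary procedure (a closure).
(define (square/k x k) (k (* x x)))

;; Instead of writing (+ (square 3) 1), we hand square/k the rest of
;; the computation as a procedure:
(square/k 3 (lambda (v) (+ v 1)))   ; => 10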

Note that even imperative languages, like C or BASIC or Java, work like this, just in a very restricted form, in which the continuation passed to every function is always a "return continuation", which resumes execution at the point of the call that invoked the function in question. In these continuation-challenged languages, the return continuation usually has limitations on what parameters it can take, i.e. how many or what kind of values it can "return". In a fully general continuation model, a continuation is just like any other procedure and can accept as many arguments, of whatever types, as any other procedure can.

As for call/cc, it's easy: all it does is turn the current continuation - i.e. the continuation at the point in the program where call/cc is called - into a closure. There really isn't much more to it than that.
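
A minimal illustration, just to have something concrete (evaluated at a Scheme REPL):

(+ 1 (call-with-current-continuation
       (lambda (k)
         ;; k is an ordinary procedure; applying it behaves like
         ;; returning a value from the call/cc expression
         (k 41))))
;; => 42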

The rest is figuring out what you can do with continuations, to which the answer is, anything that involves control flow: build threads, engines, coroutines, do goal-seeking, backtracking, etc. That's where some of the confusing stuff really starts, because once you start messing with control flow, it's easy to get confused.

To demonstrate that, I'll close with a sig I've sometimes used on c.l.scheme, which is only useful for making continuations seem impenetrable to the uninitiated. It should run under any R5RS Scheme:

; breaking the Perl lock on cryptic .sigs:
(list->string(call-with-current-continuation(lambda(x)(let*((z(map(lambda
(c)(reverse(cons c(call-with-current-continuation(lambda(k)(list k 0))))))
(string->list "patsonm")))(i(apply max(map car z)))(cs(cddr(assoc i z)))(xs 
(append(string->list(case i((4)"ul")((5)"@")((6)"i")((12)"c.")((15)(x cs))
(else "")))cs))(k(cadr(list-ref(map(lambda(i)(list-ref z i))(list 3 0 4 1 
2 5 4 4 5 2 3 5 4 1 6))i))))(k `(,@xs ,k ,(+ i 1)))))))

No, no, no! to no, no, no!

A continuation *is* a closure - there's no difference, conceptually

I don't agree. I think there are a few reasons that this is wrong:

  1. Firstly, say in Scheme, closures are not part of the language, but a way to understand the semantics of the language. It is possible to implement Scheme without there being anything in the implementation that can obviously be described as a closure, e.g. in Alan Bawden's implementation using linear graph grammars.
  2. Secondly, in scheme the value passed to its argument by call/cc is a procedure that may be utilised in an application. Conceptually a closure is not a procedure, but the way the environment is presented to a procedure. Thus in implementations where it makes sense to understand continuations in terms of closures, it is better to say that a continuation encapsulates the closure in a procedure so that it may be applied...
  3. ...and then, lastly, what it means to pass a value to a closure still needs to be explained. While it is possible to understand "what will be done with the value on termination" as part of the closure (that is, your extra parameter, a kind of special slot in the closure), I don't think that is part of the ordinary meaning of closure.

Yes, yes, yes!

A continuation *is* a closure - there's no difference, conceptually
I don't agree. I think there are a few reasons that this is wrong:
It's not wrong. I may have oversimplified slightly, but the point was to counteract the idea that a continuation "operates on the scope of the entire program, wrapping all program state and carrying it around with it". A continuation is a (lexical) closure, in the sense that it closes over the lexical environment at the time of its creation, in exactly the same way that an ordinary lexical closure does. That is the main sense in which I meant what I wrote. However, with "there's no difference, conceptually", I should have been more specific: there's no difference in the way that a continuation closes over its lexical environment, and that's what's important regarding the point I was responding to.
1. Firstly, say in scheme, closures are not part of the language, but a way to understand the semantics of the language. It is possible to implement scheme without anything there being anything in the implementation that can obviously be described as a closure, eg. in Alan Bawden's implementation using linear graph grammars.
I don't understand your point, although I'm not familiar with the details of the Alan Bawden implementation you mention (and the MIT AI FTP server is refusing anonymous logins ATM). The Scheme expression (lambda (x) (lambda (y) x)), when applied to a single value, is required to return a value which, however else you describe it, is a closure, which closes over the lexical binding of x (modulo compiler optimizations which could be eliminated with a more complex example). You can say that this value encapsulates a closure, if you like, but I'm using "is a" in the taxonomic, OO sense, i.e. you could also say that the result is a "kind of" closure.
2. Secondly, in scheme the value passed to its argument by call/cc is a procedure that may be utilised in an application. Conceptually a closure is not a procedure, but the way the environment is presented to a procedure.
I should have been clearer in my post about the distinction between the concepts of closures and continuations, and their representation as first-class values in a language. In languages which support first-class lexical closures, like Scheme, ML, Haskell, Javascript, Perl, etc., a first-class closure is represented in the language as a procedure. In such languages, at the level of first-class values, a closure *is* a procedure. I agree that conceptually, there's more to it, but I don't think that's particularly important for the point I was making. See next point.
Thus in implementations where it makes sense to understand continuations in terms of closures, it is better to say that a continuation encapsulates the closure in a procedure so that it may be applied...
By my point above, I don't agree: in languages which have closures as first-class values, closures are encapsulated as procedures anyway, so there's nothing special about continuations in this respect, which is precisely my point.
3. ...and then, lastly, what it means to pass a value to a closure still needs to be explained.
Again, that's a feature of the representation of first-class closures as procedures in languages which do this.
While it is possible to understand "what will be done with the value on termination" as part of the closure that is, your extra parameter, a kind of special slot in the closure, I don't think that that is part of the ordinary meaning of closure.

You're referring to my point about continuation-passing style. I agree, some clarification is needed, although in a slightly different area (the point about the ordinary meaning of closure is already dealt with above). The point that needs to be clarified is that if one treats all procedures as taking an extra, implicit continuation parameter, then there is a distinction between a first-class continuation and an "ordinary" first-class closure: the former discards the continuation parameter it is provided with, whereas the latter (usually) does not. This results in the behavior of continuations as "jumps".
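
To make that distinction concrete, here's a small sketch in explicit CPS (the names add1/k and make-escape are purely illustrative):

;; In explicit CPS, every procedure receives its continuation k.
;; An "ordinary" closure uses the k it is handed:
(define (add1/k n k) (k (+ n 1)))

;; A reified continuation is also just a closure, except that when
;; applied it ignores the k it is handed and invokes the one it
;; captured instead:
(define (make-escape captured-k)
  (lambda (v ignored-k) (captured-k v)))

;; Applying a procedure made by make-escape therefore "jumps" to
;; captured-k no matter where it is called from.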

With the various clarifications above, I'll restate what I was getting at:

  • Continuations close over the lexical environment at the time of their creation.
  • Lexical closures close over the lexical environment at the time of their creation.
  • In most languages which support them, first-class lexical closures are represented as, or packaged in, procedures.
  • In languages which support them, first-class continuations are represented as, or packaged in, procedures.
  • To understand continuations, it helps to think in terms of continuation-passing style, in which all procedures take an extra parameter, their continuation. In this model, the only difference between a first-class lexical closure, and a first-class continuation, is that when applied, the latter discards its continuation parameter. There's no difference in terms of their closure over the lexical environment in which they were created. The only difference is in terms of their packaging as procedures.

first-class continuations

IMO you're not using "first-class continuations" accurately. If you want to see syntactically first-class continuations, look at Filinski's symmetric lambda calculus. Expressions deliver values. Continuations receive values.

CPS is too implementation-oriented to help define continuations. Scheme's continuations appear as functions that never return, which is a good way to understand continuations, but obscures the degree to which expressions and continuations complement one another.

Re: first-class continuations

I wasn't trying to "define" continuations, but to explain them, without resorting to formalisms that would be entirely unfamiliar to the OP, and also without resorting to hardware-oriented explanations. The kind of first-class continuations encountered in Scheme, Stackless Python, Javascript, and (allegedly) Parrot are not syntactic, and those are most relevant in this context, IMO.

I think Dave Herman hit on an important point with his enumeration of some of the many perspectives on continuations, at the end of his "pedagogy" post. The only safe way to make specific claims about continuations is within the context of a specific formalism, which doesn't help much in explaining them, in any detail, within the context of mainstream languages.

Of course, one can attempt to explain the general concept of continuations, but as I've written in another post, achieving understanding depends on mapping to something that's already understood. If what's already understood centers around the semantics of mainstream languages, that constrains the possible explanations.

Re: first-class continuations

Not alleged--they're in there in Parrot, for better or worse. (Better conceptually and for safety, worse for performance, but you gotta take the bad with the good) You can't make a function or method call without 'em, since we've gone fully CPS.

Nobody, to date, has died from the enforced exposure to continuations. I expect that can be taken either as a sign they're not toxic, or that we've hidden them sufficiently as to make them invisible in the common case.

Thanks

Sorry, that was "alleged" in the sense of "don't know for sure, and am too lazy to check"... ;)

Nobody, to date, has died from the enforced exposure to continuations. I expect that can be taken either as a sign they're not toxic, or that we've hidden them sufficiently as to make them invisible in the common case.

Well, I should hope they'd be invisible in the common case... But anyway, well done! It'll be really interesting to see how their presence plays out in practice, in a larger programming community.

Continuations and closures - definition in CTM

A continuation *is* a closure - there's no difference, conceptually

This is confusing, since there is definitely a conceptual difference. In the terminology of CTM, a closure is a procedure value (procedure definition + contextual environment) and a continuation is a semantic stack ('what remains to be executed'). Very simple; simple enough so that the formal definition can be given to second-year students!

Making a continuation first-class means to 'freeze' a semantic stack so that it does not execute, and whenever the continuation is invoked, an executable copy of the frozen stack is created.

Continuations for the rest of us

I agree that simply saying "there's no difference" is confusing. I addressed this point in my post above entitled "Yes, yes, yes!". I wrote:

the point was to counteract the idea that a continuation "operates on the scope of the entire program, wrapping all program state and carrying it around with it". A continuation is a (lexical) closure, in the sense that it closes over the lexical environment at the time of its creation, in exactly the same way that an ordinary lexical closure does. That is the main sense in which I meant what I wrote. However, with "there's no difference, conceptually", I should have been more specific: there's no difference in the way that a continuation closes over its lexical environment, and that's what's important regarding the point I was responding to.

As for the CTM definition you describe, I don't doubt its efficacy in the context of CTM or a formal course, but it seems to me that it fails to directly explain the exact point I was trying to make. In practice, "what remains to be executed" tends to be quite confusing to programmers in a non-formal context, and is precisely one of the points which leads to the notion that continuations "wrap all program state", which was the point I was responding to.

If a programmer understands closures, then capturing a continuation can reasonably be described as converting the execution state at a given point into a closure. This is the most effective way I've found to communicate the intuition behind first-class continuations, in an informal context. There are various caveats - e.g. continuations don't return when applied - but these are minor compared to the misconceptions which tend to arise when people start talking about "the rest of the computation" and the like.

Give the intuition, but don't sacrifice truth

capturing a continuation can reasonably be described as converting the execution state at a given point into a closure

Be careful when using precise concepts in a fuzzy way or for the sake of intuition. Formally, a continuation is definitely not the same as a closure. Calling it a closure for the sake of 'intuition' is a bad idea, because it will only lead to further confusion when people try to understand things better.

There is a simple rule that I try to follow when explaining things intuitively: every sentence should express a truth when the concepts are taken according to their precise definition. That is, when you understand things better, all previous sentences still remain true! This is a very attractive property. We tried hard to follow this rule throughout CTM.

In this case, it seems that what you are trying to say is that the execution state is reified into a callable entity. This is only saying a little, but at least it is saying the truth.

No sacrifice necessary

In this case, it seems that what you are trying to say is that the execution state is reified into a callable entity. This is only saying a little, but at least it is saying the truth.

This misses the main point of what I was saying, and fails to address the misconception that I was attempting to address.

The misconception - which is a common one amongst programmers trying to understand continuations - is that capturing a continuation saves the entire state of the program, including both environment & store. This misconception often arises due to explanations that talk about "the rest of the computation", or as you put it, "what remains to be executed".

To explain what really happens, I believe it is valid and useful to point out that a first-class continuation captures its lexical environment (and not its store), just as a lexical closure does. This exploits an existing understanding of lexical closure to communicate quite effectively which aspects of program state a continuation does and doesn't capture.

In addition, as you put it, "the execution state is reified into a callable entity". So now our first-class continuation shares two rather major characteristics with first-class lexical closures: it closes over a lexical environment, and it is a callable entity. From the point of view of a programmer using them in a program, the only observable difference between the two is that the continuation doesn't return to its caller when applied.

As I've said, I've found this to be an effective explanation of how first-class continuations work, as a programming construct, as opposed to a formal concept. I also don't believe that it is a particularly fuzzy explanation, or that it sacrifices truth. In fact, it could be made into quite a precise explanation, relative to a language implementation which uses CPS to implement first-class continuations. (BTW, continuations in CPS are sometimes referred to as "continuation closures".)

Earlier in this thread, Scott Turner objected that "CPS is too implementation-oriented to help define continuations". However, my response was and still is that I'm not attempting to define continuations as a formal construct, but rather to explain them as a programming construct in terms of concepts already familiar to practicing programmers.

A little bit of formalism goes a long way

my response was and still is that I'm not attempting to define continuations as a formal construct, but rather to explain them as a programming construct in terms of concepts already familiar to practicing programmers.

The two are not mutually exclusive. The abstract machine in CTM is fully formal yet it uses only concepts already familiar to practicing programmers. I prefer to give this formal definition rather than try to explain things in natural language.

BTW, I am not "missing the main point of what you are saying". I just think that the misconception you point out is aggravated by fuzzy explanations. I think the only way to put the misconception to rest is to give the formal definition. I know how to give an extremely simple formal definition! Programmers who are afraid of such a simple definition either have to get over their fears or be resigned to be mediocre.

...but not all the way

I prefer to give this formal definition rather than try to explain things in natural language.

Frank said something of the sort the other day, in the most recent types discussion. It's a common and understandable formalist position, but it can also be a huge cop-out, if you either cannot, or are not willing to, explain the mapping of a formal concept into some less formal context -- or in the case of mainstream programming languages, into an essentially formal but much more complex context.

BTW, I am not "missing the main point of what you are saying".

I wrote "This misses the main point of what I was saying" in response to a statement of yours which purported to encapsulate what I was trying to say. Your statement did indeed miss the main point of what I was saying. IOW, you mischaracterized my position, and I responded to that.

I know how to give an extremely simple formal definition! Programmers who are afraid of such a simple definition either have to get over their fears or be resigned to be mediocre.

It's not necessarily a matter of "fears". It's more a question of pragmatism, for people who aren't full-time students and don't have a formal CS background. Learning is a costly activity, in terms of time and brain bandwidth, and the common choice to avoid expending significant effort on learning something new is often a very rational one. As I've said, I'm not arguing about the best way to communicate something to e.g. the second year students you mentioned earlier. By second year, they ought to have a solid basis for quickly absorbing a simple formal definition. I'm talking about an explanation that doesn't require significant prior study.

If the CTM description does indeed "use only concepts already familiar to practicing programmers", and if readers can jump to the relevant page and begin reading without having worked through a few hundred prior pages, then that's great. However, in online discussions, responding to a question with "get this book & study it" isn't always considered responsive.

Fuzzy explanations lead only to confusion

I believe it is valid and useful to point out that a first-class continuation captures its lexical environment

So what is its environment? Can you point it out to me in the source text? I think not. It depends on the execution path up to the point where the continuation is created. The adjective "lexical" does not fit. Forcing a continuation to be a closure is just plain wrong. Giving the formal definition of both makes this very clear.

One man's fuzzy...

I believe it is valid and useful to point out that a first-class continuation captures its lexical environment
So what is its environment? Can you point it out to me in the source text? I think not.

Sure I can. To see this clearly, perform the trivial rewrite into CPS at the point at which a continuation is to be captured. I'm referring to the lexical environment of the resulting continuation procedure, which certainly is visible in the source text. It's also easy enough to see this in the source text without the rewrite.
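
For example (with a made-up procedure p and a binding for x, and ignoring the control aspect, since the point here is only the lexical environment):

;; A procedure that receives the captured continuation:
(define (p k) (k 5))

;; Direct style:
;;   (let ((x 10)) (+ x (call/cc p)))     ; => 15
;;
;; After the rewrite, the continuation at the capture point is an
;; explicit closure, and its lexical environment (the binding of x)
;; is plainly visible in the source text:
(let ((x 10))
  (p (lambda (v) (+ x v))))              ; => 15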

It depends on the execution path up to the point where the continuation is created.

I'm saying that for explanatory purposes, it can be useful to treat the lexical environment, at the point where the continuation is captured, separately from the "rest of the control stack". The reason for this is that someone who is already familiar with the behavior of first-class lexical closures, and e.g. their semantics when passed "upwards" (which also involves capture of part of the control stack), can readily transfer this knowledge into an understanding of continuations, if the connection is pointed out to them.

Note that in my first post above, my description involved thinking "in terms of continuation-passing style, in which every function is called with an 'extra' parameter, its continuation, and it calls that continuation when it's done." If you take this into account along with the continuation-as-closure perspective, it takes care of the control stack issue. In principle, there's nothing fuzzy or imprecise about it (although clearly my treatment of it in this thread needs work).

The adjective "lexical" does not fit.

The adjective is a precise fit for what I'm applying it to.

Forcing a continuation to be a closure is just plain wrong.

Then how would you explain the fact that in the usual CPS model, continuations are implemented as first-class lexical closures? Perhaps you'd agree that in that CPS model, it's OK to describe a continuation as a closure, with the appropriate caveats to distinguish these first-class values from the abstractions of which they are reifications. Now, consider a CPS-based compiler: under the hood, its continuations are closures in the same sense. But we can also use this model to understand the behavior of first-class continuations in a language, even if an actual implementation happens not to use CPS to implement continuations.

"trivial" is in the eye of the beholder

Sure I can. To see this clearly, perform the trivial rewrite into CPS at the point at which a continuation is to be captured. I'm referring to the lexical environment of the resulting continuation procedure, which certainly is visible in the source text.

I doubt that most programmers would consider the rewrite "trivial" according to the dictionary meaning of the word (not the mathematician's sense of "I understand it; do you?"). Note that the resulting continuation procedure is only part of the story: it contains a reference to another continuation procedure, etc. The continuation's complete environment consists of the union of all the environments of these continuation procedures, which cannot in general be known statically (as I said). So I stand by my previous comments: continuations are not closures. As both your CPS example and CTM show, a continuation is more like a stack of closures. In my experience, CTM's semantics is much easier to understand than CPS semantics for a practicing programmer.

Anyway, let's just agree to differ. I am leaving on holidays tomorrow and will be off-line for several weeks; I wish you a good holiday season!

Un-sacrificing the truth

So I stand by my previous comments: continuations are not closures. As both your CPS example and CTM show, a continuation is more like a stack of closures.

Both claims are true: first-class continuations can be implemented as lexical closures, and in that context, continuations certainly are closures; but since those closures contain a reference to a parent continuation, they can also be viewed as a stack of closures. That doesn't detract from the fact that they're closures.
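
As a concrete illustration of that "stack of closures", here's a tiny expression CPS-converted by hand (illustrative helper names):

;; CPS version of (+ 1 (* 2 x)) with x = 3. Each continuation
;; closure closes over the next one, which gives the "stack":
(define (times/k a b k) (k (* a b)))
(define (plus/k  a b k) (k (+ a b)))

(define (expr/k x k0)                        ; k0 is the outermost continuation
  (times/k 2 x
           (lambda (v) (plus/k 1 v k0))))    ; this closure closes over k0

(expr/k 3 (lambda (r) r))                    ; => 7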

This simply shows that understanding first class continuations in terms of first-class lexical closures qualifies according to the criterion that intuitive explanations must "express a truth when the concepts are taken according to their precise definition".

Continuations are not functions

A continuation *is* a closure

This statement is even less true than “an immutable pair (x,y) is just a function λf. f x y”.

It is true that both can be exposed as functions in some programming language. Just like any other type, since lambda calculus is Turing complete. But conceptually they don't have to be unified if language designers choose so.

Moreover, while pairs could be implemented as functions of the same language they live in, losing only the possibility to distinguish them from other functions, continuations cannot be implemented as functions, because the mechanism of function application in a language supporting first-class continuations already uses continuations implicitly.

Continuations can at most be functions of the host language our language is implemented in terms of. They live in a different world than functions of the language we are implementing, they don't take the current continuation as an implicit parameter. CPS transformation is not idempotent.

Fast-motion replay

I agree that that statement is oversimplified, and too strong. The gory details of what I was getting at are in the ensuing subthread.

To summarize, the point is that if you already understand closures, one potentially quick way to grasp the intuition behind first-class continuations is to look at how their semantics can be understood in terms of the semantics of closures. In a language in which the implementation of first-class continuations relies on CPS in the underlying implementation, the continuations exposed to the programmer are in fact just closures[*]. Understanding continuations in those terms is essentially equivalent to understanding continuations in terms of the semantics of an interpreter, which is a very respectable way to learn about them.

[*] Yes, they're closures in the host language, but what I wrote was intended as a starting point for developing an intuition, for someone who's trying to understand them as a programmer, as opposed to a student of semantics. Something more like my summary in the previous paragraph wouldn't work as part of such an explanation, because it gets too complicated too fast. So I "lied" as one often does when attempting to explain something. This particular lie was not as carefully worked out as the ones found in a college course, but IMO all it really requires to fix it are some suitable qualifiers. I believe the approach is sound.

C/Assembler approach

Here's how I would explain continuations to people with a C/assembler background, to a first approximation:

A continuation is a data structure consisting of: a stack, a set of CPU register contents, and a program counter. It's a partial snapshot of a program. You create a continuation by sucking this information out of the machine, i.e. grabbing the registers and making a complete copy of the stack. Then at any time in the future you can decide to invoke (or restore) the continuation by copying its stack over the current one and restoring all of its register values. Then you have jumped back to what you were doing when the continuation was captured. It's like a time machine, except that it doesn't affect main memory or external devices, only the stack and registers (i.e. control flow).

This is just like setjmp/longjmp in C except that you save a full copy of the stack instead of just a pointer. So while longjmp requires you to jump upwards on the stack, invoking a continuation has no such restriction because it's got its own stack that you'll copy over the current one.
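
To stay in the thread's example language, here's the same point in Scheme rather than C: because the captured continuation carries its own copy of the stack, it can be invoked even after the capturing call has already returned, which is exactly what longjmp forbids. A rough sketch, run at a REPL:

(define saved-k #f)

(define (capture)
  (+ 1 (call-with-current-continuation
         (lambda (k)
           (set! saved-k k)   ; stash the continuation for later
           1))))

(capture)      ; => 2, and saved-k now remembers the "(+ 1 [])" context
(saved-k 41)   ; => 42: we jump back into a call that already returned,
               ;    something longjmp cannot do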

That's all! Now you can use this mechanism to do lots and lots of things: coroutines, threads, Prolog-style backtracking, etc. It's great fun to play with this in Scheme.

That hopefully explains what continuations are and what they're useful for, but don't draw any conclusions about performance. In practice people use clever tricks to avoid actually copying whole stacks all the time.

That's for C/assembler. If you know some Lisp then perhaps Guy Steele said it best.

I hope that is illuminating to someone.

Re: C/Assembler approach

Excellent explanation, Luke! I sometimes wonder if academic computer science is intentionally trying to obfuscate things :) So many complex-seeming things are so simple if you view them in terms of how you'd implement them on real hardware: continuations, monads, etc.

Monads

Monads on bare metal? I'd like to hear about that!

A lot of people have patiently tried to explain Monads to me but I've never understood.

Monads

I would just say it is a particular ADT and write down its signature and algebraic laws (there are only three); then I would give a diversity of examples, including at least the list monad and the state monad. I can reverse that order if the audience wants.

Now I comment on a popular way of introduction: that monads are for states. Its problem is that the list monad has no state.
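
Since this thread's examples are mostly Scheme, here's the list monad written out that way as a plain ADT; return and bind are illustrative names, and the laws are stated as comments:

;; The list monad as an ADT:
(define (return x) (list x))

(define (bind m f)                    ; m is a list, f maps a value to a list
  (apply append (map f m)))

;; The three laws, informally:
;;   (bind (return x) f)  =  (f x)                                   ; left identity
;;   (bind m return)      =  m                                       ; right identity
;;   (bind (bind m f) g)  =  (bind m (lambda (x) (bind (f x) g)))    ; associativity

;; Example: all pairings of two lists; no state anywhere in sight.
(bind '(1 2)
      (lambda (x)
        (bind '(a b)
              (lambda (y) (return (list x y))))))
;; => ((1 a) (1 b) (2 a) (2 b))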

You may then suggest that my ADT way has the dual problem: that a command of type, say, IO Int, is not a datum like lists.

Oh, but it is a datum; we are into higher-order programming. Programs are data too. A value of type IO Int is a datum that can be executed to do a lot of I/O and produce a number.

The advantage of the ADT way is that everyone already knows what to expect of an ADT.

Explaining monads

I would just say it is a particular ADT and write down its signature and algebraic laws (there are only three)

I agree. One of the confusing things about monads is that most explanations go along the lines of "Pure languages have no state and no side effects. The solution is to use monads. Monads are a datatype." This is very confusing.

In fact, I think it's best to separate the explanation of monadic programming in Haskell (along with the do syntactic sugar) from the explanation of monads as a structuring and modularization technique. The latter can be shown using other languages, when more convenient, reducing the fear the algebraic laws tend to evoke...

Please do

I'd sincerely like to understand what Monads are and how to use them in programs. Would you (or anyone) please briefly explain them in terms that you think I'll understand?

one link says more...

Monads are structures that enforce patterns.

Do you want to enforce a particular order of evaluation? You can take advantage of parameter passing (e.g. f(g(x))), which is a way to enforce that 'g' will be executed before 'f'. This pattern can be encapsulated in a monad (for example the IO monad in Haskell), i.e. a structure which allows you to reuse the pattern with different types of data.

Since the sequence pattern is so important, Haskell provides the do notation (do a <- g x; b <- f a; etc.), which is used for writing sequences of operations; remember that in pure functional programming there is no inherent order of operations, but in real life, order of operations matters.

If you understand what monads are now from this simple and crude explanation, you can then proceed with the more formal explanations :-).

Unfortunately that simple

Unfortunately that simple and crude explanation misses some of the point - indeed, some monads don't actually enforce anything more on the order of evaluation than the host language does, instead merely providing a notion of dependence between computations.

Not so fast

Excellent explanation, Luke! I sometimes wonder if academic computer science is intentionally trying to obfuscate things :) So many complex-seeming things are so simple if you view them in terms of how you'd implement them on real hardware: continuations, monads, etc.
I think this can best be answered with another quote, by Edsger Dijkstra:
Computer Science is no more about computers than astronomy is about telescopes.

There are at least two reasons that these "complex seeming things" seem simpler to you when viewed in terms of how you'd implement them on real hardware:

  • Your mental model of computation happens to be based on that real hardware. Anyone's understanding of anything is only achieved by mapping it to something else they already understand.
  • Mapping from an abstract model of computation to a concrete model, such as real physical machines, is quite straightforward, precisely because the models are abstract, and simply have to be appropriately "fixed" for the concrete model in question. Doing this the other way around is much more difficult, because you have to recapture the generality that is lost when you go from abstract/general to concrete/specific. IOW, you're doing things the hard way, starting with an understanding of a concrete model and trying to understand general concepts in those terms.

Academic computer science deals with abstract, formal notions of computation. Computation is something that can be, and is, done independently of what you call "real hardware", and the proofs which computer science produces are not affected by whether you're using an Intel processor, a PowerPC, or pencil and paper in conjunction with your brain.

Naturally, if you're not familiar with the usual formalizations of the abstract building blocks of computation, then explanations in those terms won't make much sense to you.

Far from "intentionally trying to obfuscate things", academic computer science in fact *simplifies* things: it formalizes concepts of computation as simply as possible, without resorting for explanation to enormously complex machines such as a modern CPU. As in any science, and in programming, the simplest explanations tend to win out, and computer science is no different.

For example, the lambda calculus -- which is quite on topic in this case, since LC could be described as a formal theory of lexical closure -- requires only three or four simple grammar rules, and three reduction rules, to completely capture the notion of computation and computability - it's far simpler to learn than the intricacies of any CPU. Because of that, it's a good place for anyone interested in learning about computer science to start - it's very simple at the introductory level, the web overflows with material about it, and with a basic understanding of it, a lot of other things will make much more sense -- including continuations, and monads. Try some of the links at the bottom of this page for starters.

bare metal

Much after the fact I've just come across a Knuth quote (well, abstract of a talk) to see Anton's Dijkstra with. Bottom-up education:

People who discover the power and beauty of high-level, abstract ideas often make the mistake of believing that concrete ideas at lower levels are relatively worthless and might as well be forgotten. The speaker will argue that, on the contrary, the best computer scientists are thoroughly grounded in basic concepts of how computers actually work, and indeed that the essence of computer science is an ability to understand many levels of abstraction simultaneously. Therefore he has put considerable effort into the design of a RISC machine called MMIX, as an aid to computer science educators. MMIX is intended to be simple and clean yet realistic. Many tools have been built to simulate the MMIX architecture, and more are under development.

And y'know, the other week me and some colleagues were hacking late and talking about the "good old days" and I reflected that even kids can pick up advanced programming techniques if they're presented in the right way. When I was a 14-year-old Amiga assembler hacker it was common practice to write code to generate a "copper list" (script for the graphics co-processor) during setup of a flashy demo. We didn't stop to reflect that this was a program writing a program in a DSL (the 3-instruction set of the copper chip) but just saw that "aha! that's neater than writing it all out as a constant." I'd bet that if we'd seen code that made a copy of the stack and tucked it away for later restore we'd read and understand it - aha! - without worrying that it's a deep and scary concept called a "continuation".

Isn't it funny that computer science can take ideas that teenagers can understand and make them too complex for most undergraduate programs? :-)

Isn't it funny that

Isn't it funny that computer science can take ideas that teenagers can understand and make them too complex for most undergraduate programs?

Funny?! I thought that complexification of common sense was sort of the whole point...

[edit: yeah, this was an attempt at humor. I probably should have put a smiley somewhere.]

Formalizing intuition

I thought that complexification of common sense was sort of the whole point...

If you used the word "formalization" here instead of the potentially pejorative "complexification", you would be closer to the truth.

Formalizing common sense notions ("intuitions") is a big part of any kind of mathematical theory, but the surprising thing is that you often find that when you take the fuzziness out of the common sense idea and see where that takes you, you discover some not-so-obvious ramifications and gotchas.

And sometimes going through this process leads you to find solutions to problems that on the surface are not examples of the same idea, but on a deeper level are.

If it were only a question of making things complicated, I think everyone would agree it was a waste of time.

the metal will live on

The Dijkstra quote about telescopes can be interpreted in a strong sense, in which actual computers are considered mere tools which happen to be useful in the exploration of the abstract topic of computing. In that strong sense, though, Dijkstra's analogy is flawed, because a significant part of computer science is about the study of machines in general, both abstract and real, whereas there's no part of astronomy which is about the analysis of telescopes, abstract or otherwise. (Computer science is actually quite unique in that way: the tools used in its study are themselves a core subject of the study.)

I imagine that Dijkstra would have acknowledged this flaw in his analogy, and so I interpret his quote differently. Most of the grandparent comment describes my interpretation. I don't see how that interpretation is at odds with the Knuth quote. The point is not "that concrete ideas at lower levels are relatively worthless and might as well be forgotten", but rather that those concrete ideas are instances of more general, abstract ideas.

If we're to move beyond simply using our intuition to do things that we don't even have names for, then there are benefits to inventing or discovering those general ideas, and understanding how they relate to concrete things. As I pointed out in the grandparent, this is harder than just acting on our intuition about concrete things.

As the example goes, dogs successfully act on their intuition, doing calculus in order to retrieve a thrown ball. I'm sure that given the opportunity, Fido would object to learning formal calculus, and point out how funny it is that humans "can take ideas that [dogs] can understand and make them too complex" for even the most mathematically advanced dog. What conclusions should we draw from this canine intransigence?

When I was a 14-year-old Amiga assembler hacker it was common practice to write code to generate a "copper list" (script for the graphics co-processor) during setup of a flashy demo.
...
Isn't it funny that computer science can take ideas that teenagers can understand and make them too complex for most undergraduate programs? :-)

I think I addressed the latter point quite well in the grandparent post. Perhaps if you disagree, you could respond to some of those points.

As for what 14-year-olds can do, some 14-year-olds today understand monads, continuations, typed lambda calculi, category theory, and other abstract PL notions better than I do. In part, I think that's simply a function of the environment in which one grows up. Don't undersell your younger self, whose brain was uncontaminated by a particular concrete model of computation.

Nostalgia

When I was a 14-year-old Amiga assembler hacker it was common practice to write code to generate a "copper list" (script for the graphics co-processor) during setup of a flashy demo.

Apologies, for I'm kind of heading right off topic here. I haven't heard the phrase "copper list" in many years - it really takes me back! Days spent with Fat Agnus and friends ...

It makes me wonder though, perhaps I should look back at some of my old Amiga assembler code. I wonder how it looks now that I've, at least to a small extent, started to figure out what programming is about. I bet it would be interesting - I wonder how those floppies are doing?!

[OT] I definitely shouldn't

I had actually saved some of my assembly programs for the Amiga, and when I last looked they appeared pretty dull. Let's face it: I didn't even know of the O-notation and was BLITting. Yeah. If I could go back, I would hand my 15-year-old self a copy of SICP instead.

C'mon! SICP's great but none

C'mon! SICP's great but none of its Aha!'s are as big as seeing the rasterbars in Interference start rotating. :-)

Would that I could find my

Would that I could find my old Amiga floppies, I wonder what I'd think of the code :-). Hard to guess, I've actually been pleasantly surprised when digging up old Java code that I wrote as a teenager.

But oh happy day! I just ordered copies of The Amiga System Programmer's Guide and the Amiga Hardware Reference Manual from Alibris. I've been searching for these for years after insanely giving them away when I sold my A2000 long ago.

with trepidation

It probably isn't worth even trying, but...

Computation is something can be, and is, done independently of what you call "real hardware"...

The "what" is, but in many cases not the "why." If you want to make an intellectual contribution to computer science, you find yourself faced with a vast array of possible avenues to pursue, and must have a reason to choose some over others. Why go left rather than right? Part of mastering a field is establishing an underlying value system that drives the theory. A pure mathematician may have a sense for elegant or beautiful math; I usually don't, and am left wondering why I should try to prove X rather than Y. A computer scientist may have a sense for what actual computers can actually do, how humans respond to interfaces, or whatever. This motive can justify all sorts of abstractions, but without it you're just pushing symbols.

As in any science, and in programming, the simplest explanations tend to win out, and computer science is no different.

And as in any science, the most esoteric phenomena quickly emerge at the forefront, because all the low-hanging fruit have been picked. Physics has been driven to the very edges, studying quarks and galaxies, but very little in between. It may be fascinating work, but statistical mechanics and Schroedinger's equation are lousy foundations for an understanding of physics.

Because [it's grammatically simple], [lambda calculus is] a good place for anyone interested in learning about computer science to start

If you want to reason formally or automatically about programs, sure, it's great. But it's sterile as a starting point. Without sufficient background in programming or mathematics, it's hard for me to see why a student would care about lambda calculus.

Symbol Slinger

This motive can justify all sorts of abstractions, but without it you're just pushing symbols.

Don't forget: programming is all about pushing symbols ;-)

The biggest deficiency with most professional programmers, in my experience and opinion, is not that they are too abstract-thinking, but that they aren't abstract-thinking enough. This causes them to miss deeper patterns in their code that could simplify it and make it more robust and maintainable.

Understanding abstract, but deep and general, models of computation like the lambda calculus is a prerequisite to calling oneself a truly proficient programmer.

Maybe that's true once you

Maybe that's true once you get to a certain level. My experience with student programmers is that the frustrations are more-or-less evenly split between missing the big picture and having no clue what the machine is doing. Understanding abstract models may be a prerequisite, but no less than understanding the relative speeds of different pieces of hardware, why certain things are fast and others are slow. I think you need a comprehensive understanding to be a good programmer, but I'm not sure which end is the best to start from.

there is no try

If you want to reason formally or automatically about programs, sure, [lambda calculus is] great. But it's sterile as a starting point. Without sufficient background in programming or mathematics, it's hard for me to see why a student would care about lambda calculus.

I wasn't proposing lambda calculus for students new to programming. My recommendation was aimed at someone who already has a background in programming and wants to better understand PL theory. In that case, why someone would care is that LC makes a good foundation for quite a bit of the rest of PL theory, and it doesn't require a mathematical background. Any programmer capable of learning more than one programming language can quite easily understand lambda calculus, at the level of its interpretation as a prototypical programming language.

That said, kids are taught elementary algebra in high school, and I don't see any reason that lambda calculus couldn't or shouldn't be taught at that level — perhaps in the same class, since both deal with abstracting via names, and have overlapping notions of substitution. The question of why a student would care about lambda calculus applies at least equally to algebra.

And as in any science, the most esoteric phenomena quickly emerge at the forefront, because all the low-hanging fruit have been picked. Physics has been driven to the very edges, studying quarks and galaxies, but very little in between. It may be fascinating work, but statistical mechanics and Schroedinger's equation are lousy foundations for an understanding of physics.

The equivalents of statistical mechanics and Schroedinger's equation would perhaps be something like category theory, or the extreme edges of type theory. Lambda calculus is more like Newtonian mechanics, basic and ubiquitously applicable. I don't think PL theory suffers unusually from having been driven to the edges. There's plenty of material in the middle and at a beginner level, that's accessible to someone willing to study it. But even to be willing to study it requires getting past some of the prejudices that many working programmers tend to have, like whether "academic computer science is intentionally trying to obfuscate things", which was what I originally responded to in this subthread.

Remembrance of things past...

If you want to reason formally or automatically about programs, sure, it's great. But it's sterile as a starting point. Without sufficient background in programming or mathematics, it's hard for me to see why a student would care about lambda calculus.

I'm not sure. Students care about lots of strange things. In my own case, I started programming on an Apple IIGS in Applesoft Basic, although of course "programming" is so charitable as to be an exaggeration. I first saw lambda calculus in The Emperor's New Mind, and while I found the main argument of that book as disagreeable then as I do now, it also introduced me to Turing machines, the Mandelbrot set (and perhaps to fractals in general) and much more. Of course I didn't really "understand" the lambda calculus at that point any more than I was really a programmer. But something captured my attention and stuck in my memory, and although it was many years before I really began to understand and to realize the connections between these subjects, I was interested in them at first independently and for their own sake.

So I guess if this idle reminiscence has a point, it's that you never know what students will care about.

C/Assembler approach carried to its logical conclusion

Satire #1:
Here is how I would explain C++ and OOP to C programmers. A class is a struct with the fields and a table of pointers to virtual methods. A non-virtual method call o.m(x) is translated to the C procedure call m(&o,x). A virtual method call o->m(x) is translated to the lookup and procedure call (o->table[m's index])(o, x). In fact the original C++ implementation, cfront, performs exactly this translation to C.
Satire #0:
Here is how I would explain C/Algol/Fortran procedure calls to microprocessor engineers... (exercise for the reader)

Do we actually call these excellent, user-friendly explanations? Does anyone use them on first-year CS students anymore?

I am sure Luke Gorrie gave his microprocessor-based operational model as a help, and it was good help. But to call it excellent is overstating; it was offered as a first-order approximation: to bootstrap one up, to prepare one for the true explanation.

(There are people who keep saying the cliche "you have to start somewhere". They never say "you have to move on" and indeed they never move on; that is why they are so keen on the starting point.)

A competent programmer must come to know operational semantics sooner or later (as well as denotational and axiomatic semantics). But even then, we all know the harm of locking oneself into one single operational model. Today there are still programmers who believe by faith that all recursions must consume stack space (they haven't even looked at the assembly output of gcc -O2).

Respectfully disagree

I was hoping to explain enough that one could bootstrap into learning to program with continuations. Why second-guess people who ask what call/cc is by assuming that what they really want is to understand formal semantics? They probably don't, nor do they need to, just to program productively with continuations.

Actually someone linked to a really good explanation on Keith's blog. It seems to explain continuations in Python in much the same way as Steele and Sussman originally explained them with Lisp and assembler.

Re: C/Assembler approach

A continuation is a data structure of: a stack, a set of CPU register contents, and a program counter.

Taking this approach, it's probably best to just come right out and say it: that data structure is, more or less, the same as a thread context. This is why it makes such a good tool for implementing multi-threading.
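
For instance, here's a very rough sketch of round-robin cooperative threads built from continuations in Scheme; spawn, yield, dispatch, and start are made-up names, and a real scheduler would need a proper queue and error handling:

(define ready '())          ; queue of saved thread contexts
(define exit-k #f)          ; where to escape to when all threads are done

(define (spawn thunk)
  (set! ready
        (append ready
                (list (lambda (ignored) (thunk) (dispatch))))))

(define (yield)
  (call-with-current-continuation
    (lambda (k)
      (set! ready (append ready (list k)))  ; save this thread's context
      (dispatch))))                         ; and run somebody else

(define (dispatch)
  (if (null? ready)
      (exit-k 'done)
      (let ((next (car ready)))
        (set! ready (cdr ready))
        (next #f))))                        ; restore a saved context

(define (start)
  (call-with-current-continuation
    (lambda (k) (set! exit-k k) (dispatch))))

(spawn (lambda () (display "A1 ") (yield) (display "A2 ")))
(spawn (lambda () (display "B1 ") (yield) (display "B2 ")))
(start)                                     ; prints: A1 B1 A2 B2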

You can even go the other way and make (sub)continuations from threads.

Continuations related to thread schedulers

Wow, that was exactly what I was thinking based on the machine-oriented description.

If you've ever used the (IIRC POSIX) functions getcontext() and setcontext() to implement user-level thread scheduling, you'll understand why that makes so much sense.

Threads vs. continuations

You can even go the other way and make (sub)continuations from threads.

One issue with going the other way and making continuations from threads is that a suspended thread typically can't be restarted more than once. I haven't checked, but I assume that's why the "Threads Yield Continuations" paper linked to above only implements one-shot continuations.

Re: Threads vs. continuations

From the abstract:

The techniques used to implement one-shot subcontinuations may be applied directly to other one-shot continuation mechanisms and may be generalized to support multi-shot continuations as well.

Section 4.3 of the paper outlines multi-shot (sub)continuations given a thread-dup primitive. Admittedly, however, it requires "compiler support" to handle copying the stack, which isn't required for one-shot subcontinuations.

Actual implementation of the above

I already have an implementation of exactly what you describe here.

pedagogy

Couple thoughts off the top of my head:

  • Anton didn't mention what I'd think of first as a typical confusion in understanding continuations. Typically people think "reifying the state of control" somehow captures the entire "state of the world" like a database transaction, and recoil in horror at the miraculousness and undoubted inefficiency of such a feature. (I think Luke's explanation helps clarify this.) Most people don't envision a running computer as a tuple <C, rho, sigma>, but rather as a homogeneous collection of cells in a von Neumann machine. You therefore have to be careful when talking about the "state" of control, since that word carries baggage.
  • Also, I think a lot of confusion arises from the difference between explicit CPS and first-class continuations. If you're looking at it through a compiler-writer's glasses, it's hard to understand how closures and continuations could be the same thing. But if you're dealing with pure LC, then a continuation is clearly nothing but a function.
  • I've found continuations more approachable with simple abstract machines, LC, and the operational definitions of ABORT and CALL/CC that I think came from Felleisen. But as any good teacher knows, there's not usually "one perfect explanation" of anything. The best thing is to try several approaches, since a) different people will take to different explanations, and b) nobody gets it the first time.
  • So you can look at it from the perspective of a nice style for certain kinds of programs that arises naturally in languages with first-class functions, or as a linguistic abstraction to understand various control constructs, or as a mechanism for representing control in an EoPL-style interpreter, or as a "gateway drug" to monads, or as a high-level language feature with various low-level implementations.

Re: pedagogy

Anton didn't mention what I'd think of first as a typical confusion in understanding continuations. Typically people think "reifying the state of control" somehow captures the entire "state of the world" like a database transaction, and recoil in horror at the miraculousness and undoubted inefficiency of such a feature.

Keith's entry on his own blog did exactly that, which is why I drew the connection to closures - because anyone with a decent understanding of closures already knows that they don't capture the entire state of the world. So your point was supposed to be implied, but you're right, it should have been first on my list of "things that tend to mess people up when thinking about continuations".

Terminology

When used informally both "closure" and "continuation" are used to refer to one of three related but slightly different things: a semantic construct (think denotational semantics), an implementation technique (think "compiling with continuations" and "CPS") and a language feature ("first class functions", "reified continuations"). That's two terms each with three possible interpretations. You do the math..

Keith is interested, I think, in using continuations as a programming construct, in the sense of call/cc or letcc. In my experience the best way to understand this is first to understand the notion of CPS (this helps you answer the question "where do continuations come from"), and then to play with Scheme. Obviously, if he was more interested in the theoretical foundation of this concept, I'd suggest a slightly different route.

Continuations in JavaScript

Here's one more link explaining continuations, especially continuation-passing style, in JavaScript.

Continuation example with explanation

(define call/cc call-with-current-continuation)

;; call/cc must be passed a procedure p of one argument. call/cc constructs a concrete representation of the current continuation and passes it to p. The continuation itself is represented by a procedure k^. Each time k^ is applied to a value, it returns the value to the continuation of the call/cc application. This value becomes, in essence, the value of the application of call/cc.

;; If p returns without invoking k^, the value returned by the procedure becomes the value of the application of call/cc.

;; Consider the following example to understand the meaning of the above definition.

(+ (call/cc
     (lambda (k^)
       (/ (k^ 5) 4)))
   8)
;; (lambda (k^) (/ (k^ 5) 4)) is the procedure p of one argument.
;; call/cc constructs a concrete representation of the current continuation (also referred to as escape procedure) and passes it to p.
;; The concrete representation of the current continuation is (lambda^ (v) (+ v 8))
;; Thus, the continuation is represented by a procedure k^ = (lambda^ (v) (+ v 8))
;; Each time k^ is applied to a value, it returns the value to the continuation of the call/cc application. This value becomes, in essence, the value of the application of call/cc. Thus (k^ 5) is evaluated by applying k^ to 5 (the waiting division is thus forgotten since k^ is an escape procedure). This evaluation results in (+ 5 8) = 13.

;; Now, consider the example

(define +8^ 0)
(+ (call/cc
     (lambda (k^)
       (begin
         (set! +8^ k^)
         (display "inside body")
         5)))
   8)

;; Here, the continuation is represented by k^ = (lambda^ (v) (+ v 8))
;; Evaluating the body
;;   (begin
;;     (set! +8^ k^)
;;     (display "inside body")
;;     5)
;; results in 5, after setting the variable +8^ to (lambda^ (v) (+ v 8)) and printing "inside body"
;; 13 is the answer of the entire expression
;; Here, p returns without invoking k^ & the value returned by the procedure p (which is 5 in this case) becomes the value of the application of call/cc.
;; Later, we may invoke
(* (/ (+8^ 35) 0) 100)
;; to evaluate to 43

;; Reference: Daniel Friedman's notes

Continuations

At the risk of being accused of self-promotion, I think one might want to take a look at the Io language (not to be confused with this language), as described in this book. It's a pure "continuational" language, with no implicit call stack and a syntax which makes cps the natural (and only) way to use it.

+ 3 4 -> x;
print_int x;
terminate

The sample above shows how + is a three-argument continuation, accepting two numbers and a continuation to which it passes the result. The "-> ;" syntax is merely a 'lambda' ("\ ->" in Haskell), and the separate ";" is similar to Haskell's "$" operator.
The ubiquitous factorial function thus looks like:

declare fac: -> n cont;
  = n 1 (cont n);
  - n 1 -> m;
  fac m -> m;
  * n m;
  cont.
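
For readers more comfortable with Scheme, a rough transliteration of the factorial above into explicit CPS might look like this (my own sketch, not code from the book):

(define (fac n cont)
  (if (= n 1)
      (cont n)
      (fac (- n 1)
           (lambda (m)
             (cont (* n m))))))

; (fac 5 (lambda (r) (display r) (newline)))  ; prints 120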

The self promotion part comes here, when I point out the only available implementation, the next version of which I currently happen to be working on. This is not a 'practical' implementation (or really, language), but it works as a vehicle for learning continuations by doing them.

i looked too...

hi, i also searched for the original for that language (after, i think, ehud mentioned it on the old lambda). i found the original implementor (a student of the book's author, iirc), but they either no longer had the code, or didn't want to give me it (it was a long time ago and i can't remember the details). nice to see someone has resurrected it!

constrained gotos

I always thought it convenient to think of continuations as constrained gotos. Rather than being able to jump to an arbitrary place in the code, you can only jump to some place with some previously established context. So in a time-travel analogy, continuations only permit you to go back in time, and not forward, or to some place otherwise unreachable (sideways in time?) during continuation-free execution. This description isn't flawless, but I like to think it's pretty easy to swallow.

Relation to function calls?

I've always wondered whether function and procedure calls are a special case of continuations. That is, a function call defines a place in the code to return a value to and continue processing once the invoked code is completed.

Four thousand holes in Blackburn, Lancashire

In a sense, the abstract continuation for a function call is everything but the function call. It's more like a runtime instance of the function call site, including the code which surrounds the call site, but excluding the function call itself.

A common way this is expressed in Lisp/Scheme pseudocode is with expressions like (+ [] z), where the brackets represent the place where a function call, or other expression, would go. But the brackets themselves are not the continuation - they're more like the boundaries of the continuation, or the entry point into it. The brackets represent the place where a value is returned to - a hole - but the rest of the code, which accepts the value and does something with it, is also part of the continuation.

Capturing a first-class continuation with an operator like call/cc takes an instance of such a "hole" at runtime and turns it inside out (sort of ;). The result is a first-class value which can be applied - the hole becomes a parameter, so the expression above would be transformed, at least conceptually, into something like (lambda (v) (k (+ v z))), where k is the continuation of the original expression, (+ [] z).

One of the complicating aspects of the overloaded terminology is that an abstract continuation is like the opposite of a value (its dual), whereas a first-class continuation is a value.
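
To make the "hole" idea concrete, here is a small sketch (the name hole-k is mine) that captures the continuation of the hole in (+ [] 10) and then uses it like a procedure that adds 10. Re-invoking a top-level continuation like this behaves as described at a typical Scheme REPL:

(define hole-k #f)

(+ (call/cc (lambda (k) (set! hole-k k) 0)) 10)  ; => 10, and hole-k now holds the continuation (+ [] 10)
(hole-k 5)                                       ; control jumps back into the hole: the result is 15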

Re: Relation to function calls?

Given continuations, functions no longer need to be primitive. A function is just a continuation that accepts an (argument(s), return-value-accepting continuation) pair. More readably: A -> R ~= Cont (A, Cont R). The assembler analogy (for this case) is pretty immediate: a continuation is a code address and it accepts (via the stack or registers or whatever) its arguments and a return address. Of course, one can't smash all the information about a continuation into a code address (at least in the naive way), so this analogy falls apart for upward uses of continuations*, but this is exactly the same problem as higher-order functions.

* Nevertheless the analogy is quite useful, just not for understanding general continuations.
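
As a minimal sketch of the encoding described above (names are mine), a two-argument function becomes something that takes its arguments together with a return continuation:

(define (add-k args return-k)            ; plays the role of Cont (A, Cont R)
  (return-k (+ (car args) (cadr args))))

; the direct-style call (+ 1 2) becomes:
(add-k (list 1 2) (lambda (r) (display r) (newline)))  ; prints 3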

parrot

there was a good explanation of continuations related to the parrot virtual machine, iirc - a series of posts on the implementor's blog. you might want to try searching for it (it was posted to the old lambda site).

"What the heck is ..."?

The main Parrot implementor is Dan Sugalski. His weblog is called "Squawks of the Parrot". He talked about continuations here and here. Some of the weblog posts are given special "What the heck is ...?" titles, for various VM implementation topics. See "What the heck is: Continuation Passing Style (CPS)".

What the heck are: heap allocated activation records

Dan is not describing continuation passing style (a direct style compiler for Pascal could choose to generate the code that he describes), and I don't think it's a very good explanation anyway.

CPS can be supported by most languages

Can probably be done in Pascal, though as with most things in that language it would require some of the non-standard extensions to get real work done. Not sure that I would disagree that the explanation could be cleaner, but I haven't found one that I can connect with just yet.

Still, the idea is that CPS can be implemented with record structures to achieve continuations - and for those who understand these things in terms of stack allocation vs. heap allocation, it might help the understanding of how CPS is implemented.
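
One way to picture "activation records in the heap" (a sketch of the general idea, not Dan's actual data structures) is to make each pending continuation an explicit record:

(define (frame n k) (list 'frame n k))      ; a heap-allocated "activation record"

(define (apply-k k r)                       ; resume a saved record
  (if (eq? k 'halt)
      r
      (apply-k (caddr k) (* (cadr k) r))))

(define (fact n k)
  (if (zero? n)
      (apply-k k 1)
      (fact (- n 1) (frame n k))))

; (fact 5 'halt) => 120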

i was referring to the descri

i was referring to the descriptions of continuations, not cps.

for some people (including me) it can help to see the underlying nuts and bolts to a particular implementation of something, even if they aren't necessary to define it.

Will the real CPS please stand up?

I wasn't talking about CPSing Pascal source programs, and neither was Dan. He describes an implementation technique (heap allocated activation records) that he calls CPS.

Really, in machine code, it's all CPS. So saying that one implementation of CPS is the real CPS, and contrasting it with another implementation of CPS, seems to miss the essence of CPS.

You can certainly implement Pascal using Dan's CPS. You don't even need to garbage collect continuations. You can just do single pointer allocation out of a separate continuation heap, and deallocate them when they are applied.

There is actually a CPS translation that translates direct style programs to Dan's CPS. Here is the algorithm: call the frame pointer the "current continuation". Call the stack pointer the "continuation heap allocation pointer". Call the static link the "closure pointer". Call the return address the "code field of a continuation closure". Call the dynamic link the "previous continuation". Call the registers that are live after a call the "free variables of the continuation". Now, you've "CPSed" the program.

So is CPS just about the words we use to call things? Is it just about what implementation of a stack ADT we choose?

Here, sir.

So is CPS just about the words we use to call things? Is it just about what implementation of a stack ADT we choose?

I'd say that a program is in CPS iff you can detect the implementation of continuations in that program's source.

Of course, the source in question can be intermediate stage output, e.g. of a compiler that compiles through CPS.

I agree that Dan's description is too implementation-specific - he describes one possible implementation technique for a particular kind of CPS which can support unrestricted first-class continuations.

what is (call/cc call/cc)?

what does (call/cc call/cc) represent? Can someone explain multiple continuations in simple words?

repeated application

Take a look here.

What (call/cc call/cc) is and isn't

To avoid possible misunderstandings, it's important to understand what the theorem in the paper (http://pobox.com/~oleg/ftp/Scheme/callcc-calc-page.html) says, and specifically what it does not say. The theorem says that ((call/cc call/cc) p) and ((lambda (x) (x x)) p) are observationally equivalent in CBV provided that p is a value. The theorem emphatically does not say that (call/cc call/cc) and (lambda (x) (x x)) are observationally equivalent -- because generally they are not. The equivalence holds only in specific contexts; one of them is an application to a value. The lemmas in the paper should make that clear. Here are a few examples:

(define (foo self)
  (if (procedure? self) (self #f) 
      (begin (display "!") (newline))))

(define (exp1) 
  (let ((f (call/cc call/cc)))
     (f (begin (display "OK") foo))))


(define (exp2) 
  (let ((f (lambda (x) (x x))))
     (f (begin (display "OK") foo))))


(define (exp3) 
  (let ((f (lambda (y) ((call/cc call/cc) y))))
     (f (begin (display "OK") foo))))

Evaluating (exp1) prints OKOK! whereas evaluating (exp2) and (exp3) prints just OK!. Clearly, (exp1) can't be equivalent to (exp2).

That is, eta-expansion is not a sound operation when applied to
(call/cc call/cc). First-class continuations can be tricky. That's why
it helps to be formal.

Those are simple words?

Those are simple words?

Ehud's or Oleg's?

It's a little hard to answer the question "what is (call/cc call/cc)?" without knowing exactly what the problem is.

One thing that could be a problem is that call/cc is being used in a point-free style, as a combinator. That sure confused me when I was learning it! For a while, I had to make myself see only forms like (call/cc (lambda (k) ...)).

Here, eta-expansion does the trick, as they say. Eta-expanding the argument to the first call to call/cc gives

(call/cc
  (lambda (k)
    (call/cc k)))

And eta-expanding the second call gives:

(call/cc
  (lambda (k)
    (call/cc
      (lambda (k^)
        (k k^)))))

Now it's clear, right :) The first application of call/cc captures its continuation, binds it to k, and evaluates (call/cc (lambda (k^) (k k^))) with k bound to that continuation, and with the same continuation as its continuation.

That will capture the same continuation, bind it to k^, and then pass it to the captured continuation.

So (call/cc call/cc) evaluates to its continuation. Notice that the second call/cc isn't necessary to get this effect,

(call/cc (lambda (k) k))

does the same thing (which is one of Oleg's lemmas). Now notice that

((call/cc call/cc) p)

replaces the application of call/cc with its continuation (which is to apply that value to the value of p). Applying that to p has the effect of sending p to the continuation "apply to p", which is Oleg's theorem about self application.

Now you know why ((call/cc call/cc) (call/cc call/cc)) does what it does.

By the way, I prefer letcc to call/cc in most cases:

(define-syntax letcc
  (syntax-rules ()
    ((_ K E) (call/cc (lambda (K) E)))))

then you can have:

(letcc k
  (letcc k^
    (k k^)))

Non-theoretical examples

LtU is great, the modern equivalent of sitting on the steps in Athens and listening to a cast of smart people passing on wisdom. Thanks.
The following possible examples of usage have been buzzing about my head for a while.
If I had a hypothetical gcc that would return control to the 'make' utility after evaluating all of the -Wall -Wthis -Wthat arguments, plus whatever other -I pieces hold steady across several translation units of the target, would this package of executable compiler plus evaluated arguments form a continuation? Can I then, as long as the compiler arguments remain unchanged, throw it a series of translation unit buffers to compile, blowing by the argument evaluation?
Another example, from a database context. Instead of a long SQL string, when I say "CREATE VIEW ..." with some positional parameters, such that the parsing of the string is effected, and the execution plan is generated, and there are some criteria blanks in the WHERE clause to which I shall bind arguments at run time, is this sort of like a poor-man's continuation?
A third example, from the web. If I make a web page, and a script creates a hyperlink <a href://www.dumb.questions.com?arg1=foo&arg2=bar>Click me for 404 justice</a>, can this be considered a left-handed continuation?
As someone who learns by oscillating between theory and practice, the reasons these examples do or do not qualify as continuations should be instructive. Thanks.

Not too non-theoretical papers

Just in case people decide to read some papers, HOSC Volume 11 Issue 2 was a special issue devoted to continuations.

Including the seminal A Generalization of Jumps and Labels by Peter J. Landin, and an introduction to it by Hayo Thielecke.

I am currently reading pp. 177-208, but it's completely another story...

On Evaluation Contexts, Continuations, and the Rest of the

...Computation.
A reconciliation of two views of continuations.

Continuations are variously understood as representations of the current evaluation context and as representations of the rest of the computation, but these understandings contradict each other: plugging an expression in a context yields a new expression whereas sending an intermediate result to a continuation yields the final answer. We show that continuations-as-evaluation-contexts are the defunctionalized representation of the continuation of a single-step reduction function and that continuations-as-the-rest-of-the-computation are the continuation of an evaluation function. Furthermore, we show that defunctionalizing the continuation of an evaluator gives rise to the same evaluation contexts as in the single-step reducer. The only difference is how these evaluation contexts are interpreted: a ‘plug’ interpretation yields one-step reduction, whereas a ‘refocus’ interpretation yields evaluation.

Java sandwich

This is how I would write the sandwich example in pseudo-Java (excuse my dialect):

class Continuation {
   Intention currentIntention;
   Location currentLocation;
   Continuation(Intention intention, Location location) {
      currentIntention = intention;
      currentLocation = location;
   }
   void takeOver() {
      for (;;) {
         switch (currentIntention) {
         case haveSandwich:
            Sandwich sandwich = getSandwich(currentLocation);
            eat(sandwich);
            currentIntention = digestFood;
            break;
         ...
         }
      }
   }
}
Continuation backToFridge = new Continuation(currentIntent, currentLoc);
fridge.openDoor();
Sandwich sandwich = new Sandwich();
sandwich.prepare(sandwich.pickIngredients(fridge.contents()));
counter.keep(sandwich);
backToFridge.takeOver();
throw new RuntimeException("Control flow reached past continuation! " + backToFridge);

Is that a continuation?

And: what next?

Not quite right...

I think the Javascript example in the past comments helped me understand something about continuations.

What I did not get quite right is that a continuation normally repeats some code that was executed before-- only in a different context. So the instructions to prepare the sandwich in my Java version should be part of the continuation, and they should first check if there is a sandwich available to eat.

What I still did not understand is the earlier comment that a continuation does not return. Is that to say that it is like a first class function, that can be invoked at a later stage (and then return)?

Continuations don't repeat

Continuations don't repeat code that was executed before. A continuation is just the remaining computation. Imagine a recursive function to sum all of the numbers in the list (in Scheme):

(define (sum numbers)
  (if (null? numbers)
      0
      (+ (car numbers) (sum (cdr numbers))))) 

Here's what the process looks like when you try (sum '(1 2 3)):

(sum '(1 2 3))
(+ 1 (sum '(2 3)))
(+ 1 (+ 2 (sum '(3))))
(+ 1 (+ 2 (+ 3 (sum '()))))
(+ 1 (+ 2 (+ 3 0)))
(+ 1 (+ 2 3))
(+ 1 5)
6

Now in the expression (+ 1 (+ 2 (sum '(3)))), what is the continuation of the term (sum '(3))? The continuation is just the remaining computation, (+ 1 (+ 2 _)). That is, when (sum '(3)) finally returns a value, the value gets plugged in where we have _.

When somebody says that continuations don't return, what they mean is that there is nothing "delimiting" a continuation: the continuation (+ 1 (+ 2 _)) *does* return, but to the top level (this is just a way of thinking about what happens to the value of a Scheme program). It definitely doesn't return its value to (sum '(3)) ... after all, we have a very simple evaluation model where the value of a compound expression is the value of the operator applied to the values of the operands. So imagine the expression (+ 1 (call/cc (lambda (k) (* (k 2) 0))))

(call/cc (lambda (k) body)) reifies the current continuation as a procedure and binds it to k in body. The current continuation is (+ 1 _), and this continuation returns to the top level (like always). When we evaluate (* (k 2) 0) we first evaluate the operands (in no particular order). (k 2) supplies 2 to the continuation k, which adds 1, and the resulting value is given to the top level. So the multiplication never happens, because before it can take place a control transfer occurs.
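
For reference, that expression as a complete program (standard Scheme):

(+ 1 (call/cc (lambda (k) (* (k 2) 0))))  ; => 3: (k 2) jumps out, so the multiplication by 0 never runs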

Briefly reading the "sandwich example", I'd like to make one point. When you invoke the continuation, you are no longer thinking about making sandwiches. You are thinking about what you have left to do i.e. you complete the remainder of the computation. So if you intend to

(begin (call/cc make-a-sandwich) (stop-world-hunger))

the continuation would be (begin _ (stop-world-hunger)).
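
A runnable sketch of that last point, with made-up bodies for make-a-sandwich and stop-world-hunger (only the shape of the control flow matters here):

(define saved-k #f)

(define (make-a-sandwich k)
  (set! saved-k k)                        ; remember "the rest of the computation"
  (display "made a sandwich") (newline))

(define (stop-world-hunger)
  (display "stopping world hunger") (newline))

(begin (call/cc make-a-sandwich) (stop-world-hunger))
; prints: made a sandwich
;         stopping world hunger
; Invoking (saved-k 'whatever) later re-runs only the remaining computation,
; i.e. it prints "stopping world hunger" again without making another sandwich.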

'Continuations' can 'return'

"Continuations do not return" is a vast oversimplification. Whether it is true or not depends on what you mean by 'continuation' (and 'return'). What do you mean by continuation?

1. In denotational semantics (the first place continuations popped up, think Strachey and Wadsworth), they are used to describe control flow (eg, jumps, exceptions, threads) while still maintaining compositionality (definition by structural induction over programs). In that case, a continuation is some element of a mathematical domain. It doesn't really make sense to speak of it returning or not.

2. By using continuation-passing style (CPS, think Plotkin for CPS and Steele and Sussman for using it in programming practice), programmers can get access to arbitrary control structures (eg, jumps, exceptions, threads) even if they are not provided by the language, at the cost of writing their program in a certain stylized form (and if the language provides closures and proper handling of tail calls). In this sense, the only thing that keeps continuations from returning is proper handling of tail calls and the programmer himself---you can certainly write non-tail calls in CPS code, which is either a bug or a really cool effect.

3. CPS can be used as a compiler intermediate language (think Steele), in which case the continuations represent the label of the next instruction (in the case of expression continuations) or the return address of functions. They don't "return", because when you jump to (or fall through to) a label you don't usually come back, and when a function jumps back to its return address it usually doesn't come back to the instruction after the return statement (though it could, if the source language supports backtracking or something).

4. Using control operators (eg, call/cc, think Friedman and Haynes), programmers can write direct-style (DS) programs that gain access to continuations as first class values. If a first class continuation is captured by call/cc, then it doesn't return (because the CPS transform or continuation semantics of call/cc does not contain non-tail calls). However, if the continuation were captured by the delimited continuation operator shift for instance, then it would return (because that's the point of shift: it corresponds to a non-tail call in CPS).

So, do continuations return? If 1. the question doesn't make sense, 2. only with TCO and if the programmer doesn't write non-tail calls, 3. probably not unless the programming language supports backtracking or something, 4. depends on the control operator.

PS: Hi erik and welcome to LTU :)

Are we talking to ourselves?

I have a meta-comment to make about this question and the answers given.

It is fairly common for new LtU readers to ask "what is X (for some X an advanced PLT topic) and what is it good for".

In response, there seem to be two quite different approaches to handling the question.

Anton's responses in this thread are an example of the first: an attempt to answer with a clear and simplified metaphor that the non-expert can grasp as a starting point to understanding. It is my belief that this is the kind of response that is being sought by the questioner.

But inevitably, there is a second stream of responses that quibble and caveat using a whole bunch of material and considerations that require a more thorough understanding of the theoretical complexities. But obviously, a person able to understand such subtleties wouldn't need to ask the question in the first place.

Those of us who already understand continuations using many different formalisms and from various different contexts don't need the run down and those who don't yet have any intuitive understanding of continuations can't make any sense of so many options, so who are we talking to with such responses?

Of course I think there is a place for such discussions between those who have established their knowledge in those areas and want to explore subtle issues of definition and implication, but I think we do a disservice to newcomers to the field by mixing them into the same threads.

Well, I thought we were talking to each other

There are a zillion and one slightly buggy but more or less in the right ballpark continuation (or monad, or dependent types, or partial evaluation, or whatever) analogies that are quite easily found all over the web.

Are people who read LtU interested in the zillionth and second, even if it's a slight improvement on all the rest? Or are they really interested in understanding the thing for what it is, and not in terms of making sandwiches?

I think this blog loses relevance somewhat if it aims for just another (albeit high quality) analogy provider....

What's this "we" stuff, kemosabe?

Are people who read LtU interested in the zillionth and second, even if it's a slight improvement on all the rest?

Depends on the reader. Some people who haven't grokked them yet might appreciate coming across that next one that rings a bell for them.
Some who have might appreciate a new approach to explain it to others.

I think this blog loses relevance somewhat if it aims for just another (albeit high quality) analogy provider....

If that was all we do, then I would personally agree.

In the spirit of another thing that we do, let me ask you, as someone who understands what you mean by "continuation" in all the contexts that you mentioned, do you think that in all four cases we are really talking about the same thing "in itself"?

Or is the very fact that there are different semantics associated with "continuation" in each context already evidence that there is no one "thing in itself", but rather a massive analogy between different but similar ideas that merely share the same name?

And if the latter, wouldn't this mean that it was meaningless to try to define the word without first limiting it to some particular shared context, implicit or explicit?

In the spirit of another

In the spirit of another thing that we do, let me ask you, as someone who understands what you mean by "continuation" in all the contexts that you mentioned, do you think that in all four cases we are really talking about the same thing "in itself"?

No.

That was the point I was trying to make. jido is struggling with what it means for continuations to not return. It's not clear to me (and possibly not to him either) which continuations he is talking about. We ought to establish that first.

Meeting the audience

It's not clear to me (and possibly not to him either) which continuations he is talking about.

I think it is pretty safe to say that he isn't talking about denotational semantics, delimited continuations or likely even CPS. ;-)

In spite of my previous questions, I'm going to disagree with you and say that all four of your examples of "continuation" DO represent a single, coherent structural notion; the problem you raise is really with the term "return".

The notion of "a closure that doesn't return" pretty much only makes sense to describe a continuation in the Scheme call/cc sense, so it is a fair assumption to interpret it in that context.

Moreover, I would contend that, as a consequence of my notion that the different kinds of continuation CAN be captured by certain structure preserving mappings between domains (in the non-technical sense), getting someone to understand a particular notion of continuations is a first step to getting them to understand the deeper pattern that all these instances share.

All this to say that the "closures that don't return" idea or even the "sandwich" metaphor are pretty good starting places to understanding if interpreted in the right implied context.

Of course there is a lot more to say, but we can't expect everyone to understand all the nuances right out of the gate when we've spent years thinking and reading about them to get where we are today.

You may be right after all,

You may be right after all, Marc. He may be trying to understand call/cc by writing Java code ;)

I would suggest that that's a wrong way to go about things. No amount of Java code, except for an interpreter or compiler for a language with call/cc, will help.

Hatching a Scheme

I would suggest that that's a wrong way to go about things.

On this we are in complete agreement. ;-)

Smoke Signals?

Hey Kevin, nice to meet you ;)

Maybe I should have been more specific in my original post. I assumed that given my examples it would be clear what I meant by continuation (well, maybe not clear to jido, but I still have my hopes up). I did make a large overstatement when I claimed that continuations were not "delimited". For the sake of simplicity, I chose to ignore anything that Danvy would label a bug or "really cool effect".

The terminology argument is a crapshoot really. After all, a continuation doesn't do anything. It's an object in a semantic domain. With respect to computer interpreters, this value is encoded in a particular data structure to be manipulated by the interpreter. Continuations may appear to have particular behaviors, but that's just a misunderstanding on the part of the observer. The interpreter is the actor, a continuation is just a value being acted upon.

That sort of thinking is annoying, and it isn't very useful imho. A sequence is an object in a mathematical domain. When we say that a sequence "diverges", we really mean that the sequence has some particular property, not that it actually performs some action. The use of the verb "diverge" is just a friendly abbreviation. So what's wrong with speaking of a continuation "returning", even in the context of denotational semantics?

Then again, the term "returns" can be confusing, as Marc pointed out. Maybe using it wasn't such a good idea. But jido asked a particular question about continuations "returning", so I was just doing my best to help him understand what I thought he was misunderstanding.

Re: Continuations don't repeat

All right, so continuations don't repeat. Maybe I was thinking in terms of making an entity of the continuation (+ 1 (+ 2 _)), so that you can use it again at a later stage. Is that not the way you use continuations?

---

If I use (+ 1 (call/cc (lambda (k) (k (* 2 0))))) instead of your expression, I will get (+ 1 (* 2 0)), won't I? So k should always be the last operation to be evaluated in the lambda expression. Everything that comes afterwards is just noise. Or did I misunderstand?

Is it just me or is it

Is it just me or is it kind of evil not to invoke the continuation in a CPS function? It seems a likely source of weird bugs.

For example, I was trying to write a function that counts windows in green houses using CPS. I started with a function that continues passing the number of windows (if the house is green) or stops with value 0. Then I wrote a function that adds up the window counts, stopping if a house is not green.

The first version was (sorry for my poor grasp of Scheme, translation not tested):

(define (greenHouseWindowCount house next)
   (if (eq? (colour house) green)
      (next (length (windows house)))
      0))

(define (totalWindowCount houses next)
   (if (eq? houses [])
      (next 0)
      (** oops incorrect ** (call/cc (lambda (c) (greenHouseWindowCount (car houses)
         (call/cc (lambda (t) (totalWindowCount (cdr houses) **)
            (next (+ c t))
)))))

the original version in dodo syntax:
def totalWindowCount fun(House[] houses -> next)
   if (houses.count = 0)
      next(0)
   else
      greenHouseWindowCount(houses[1]) -> (c)
         totalWindowCount(houses[2+]) -> (t)
            next(c + t)
see comment below for corrected Scheme version.

However, if there is a non-green house in the list it will return 0 since next is not invoked at all. I changed it to:

Note: corrected version after testing with DrScheme
(define (totalWindowCount houses next)
   (if (null? houses)
      (next 0)
      (greenHouseWindowCount (car houses) (lambda (c)
            (next (+ c (totalWindowCount (cdr houses) (lambda (t) t))))
))))

Which should give the correct result even with a white house in the list.

The first version was (sorry

The first version was (sorry for my poor grasp of Scheme, translation not tested)

I think you should find a Scheme implementation and test your code.

Scheme call/cc

I misunderstood the meaning of call/cc. My original code did not use it at all. Here is a corrected Scheme version:

(define (greenHouseWindowCount house next)
  (if (eq? (colour house) 'green)
      (next (windows house))
      0))

(define (totalWindowCount houses next)
  (if (null? houses)
      (next 0)
      (greenHouseWindowCount (car houses)
                (lambda(c) (totalWindowCount (cdr houses)
                                     (lambda (t) (next (+ c t)))
               )))))

It does not work because next is not called when there is a white house in the list.

I am a bit disturbed that my program does not use call/cc. Is that to say that I am not using CPS after all? Or can I write the same program with call/cc?

Re: Is it just me or is it

To respond to myself--
You should be able to avoid such confusion with the following rules:

  • A continuation should always be invoked
  • The default "continuation" is id defined as (lambda (a) a)

So the example should be rewritten thus:

(define (greenHouseWindowCount house next escape)
  (if (eq? (colour house) 'green)
      (next (windows house))
      (escape 0)))

(define (totalWindowCount houses next)
  (if (null? houses)
      (next 0)
      (greenHouseWindowCount (car houses)
                (lambda(c) (totalWindowCount (cdr houses)
                                     (lambda (t) (next (+ c t)))))
                (lambda (notgreen) (next 0))  ; calling next is mandatory
                )))

Then I get the following execution:

> (totalWindowCount (list `(green . 4) `(white . 3)) id)
4
> (totalWindowCount (list `(green . 4) `(green . 3)) id)
7

Um.

A continuation should always be invoked

It's perfectly legitimate to never call a continuation.

Scenario A: If you're doing explicit CPS and doing I/O, for example, you might have a function called write which takes a few parameters and a couple of continuations, completed_cont and error_cont. When the write succeeds, the write function would call completed_cont, but if there is an error, then the error_cont continuation would be called. One of the continuations is always discarded and never called.
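
For example (a sketch with names of my own choosing, not taken from any particular library), Scenario A in Scheme might look like:

(define (write-cps data port completed-cont error-cont)
  (if (output-port? port)
      (begin (display data port)
             (completed-cont 'ok))        ; success path: error-cont is simply never called
      (error-cont 'not-an-output-port)))  ; failure path: completed-cont is discarded

; (write-cps "hello" (current-output-port)
;            (lambda (ok)  (display " [written]") (newline))
;            (lambda (err) (display " [failed]")  (newline)))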

Scenario B: When searching "non-determistically" through a solution space it can be very convenient to use continuations to represent "choice points", i.e. points at which a (non-deterministic) choice is being made. Paul Graham's On Lisp has a very thorough explanation of how this is achieved if you're interested. When a solution is found, there are typically a lot of continuations waiting to be explored (other possible solutions). However, since the solution has already been found, those continuations can be thrown away without ever being "resumed".

Add a "throw-away" continuation

The solution in dodo to the two scenarios you described is to add another continuation that escapes from the current computation (you call it error_cont).

Yes, only one continuation is called-- but I enforce that at least one continuation is called.

Scenario B.

I don't see how you can determine statically that at least one continuation is called when the list of suspended continuations is dynamic. Add to that the fact that the solution-space search may not actually find any viable solutions, so the stack of stored continuations becomes empty. In that case the search "bottoms out" and is forced to (implicitly) just let the current continuation run to completion (to present the user with an error message). (I realize this description may leave something to be desired, but I hope it makes sense if you look at the On Lisp example).
The major points are:

  • It may not be possible to statically determine whether any continuation is called at all (in the search example because it is impossible to determine statically whether the "continuation stack" is non-empty and continuations are only invoked if it is), and
  • It is legitimate to never end up calling any continuation in a given function invocation; the function just returns normally.

Of course, from another perspective one could also say that it's trivially true that at least one continuation is "called"... namely the current one.

(Posted insanely late/early and I'm partially asleep, so I reserve the right to be utterly wrong :).)

Probably not a good question but ...

Does the use of continuations lead to spaghetti code kind of like the use of "goto" did? If it doesn't, why?

Hold the spaghetti

Does the use of continuations lead to spaghetti code kind of like the use of "goto" did? If it doesn't, why?

In general not. goto was bad in part because it changed the code pointer without changing any other context. This made it very hard to reason about the state of the program across jumps: if the programmer hadn't anticipated things being in that configuration at that target place in the code, bad things would happen.

With continuations, the "place you are going" has its own localized state, and receives an explicit parameter from where it was, so it is much easier to reason about how the program will behave after the jump, without having to know a lot about the state things were in before.

Yet another example of where more localized context makes it easier to reason about a program.

Hide the marinara, too

I agree with Marc. Also, because of the constraints which continuations impose (compared to goto), using them directly for anything non-trivial forces you to think about what you're doing in a way that goto doesn't.

For those sorts of reasons, first-class continuations are best used to implement higher-level constructs, like exceptions, threads, coroutines, backtracking, web frameworks, etc. Those constructs typically encapsulate the use of continuations to make them "safe for human consumption" in ordinary programs.

For the most part, first-class continuations should be seen as a tool-building feature, not something you'd regularly use directly in an ordinary program.

A new dish.

Thanks, both of you, for the clear answers to my questions. I've used escaping continuations before for early exit of procedures. But honestly, I adjusted the code in section 13.2 of "Teach Yourself Scheme in Fixnum Days" by Dorai Sitaram which is here. There are some examples of continuations in that section if anyone is interested.

ordinary use of continuations

For the most part, first-class continuations should be seen as a tool-building feature, not something you'd regularly use directly in an ordinary program.

However, in languages like Raph Levien's Io (mentioned previously on LtU here and here) and in process calculi, plain old sequential code is done via continuation. In Io, a simple syntax innovation for passing a function as a final argument makes continuation passing style look very much like imperative code. And it makes it easier to understand continuations and CPS, IMO.

Clarifying...

True. The usual syntactic sugar for monads also converts to CPS, and also makes CPS look imperative. There are other cases where even direct, unsugared CPS can be quite effective — e.g. parsers, or anywhere where multiple continuations are needed, e.g. success & fail continuations.

In the quoted sentence, I actually meant to say something more like "...call/cc should be seen as a tool-building feature...", because the comparison to gotos had me thinking in terms of faking gotos in ordinary, non-CPS code by using call/cc. IMO, there aren't many direct uses of call/cc that make sense in ordinary programs. Escaping from a scope is one of the few that does, and even that can benefit from a macro.
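
For instance, the escape-from-a-scope use looks like this (a standard idiom rather than anything specific to the comment above); wrapping the call/cc in a macro like the letcc shown earlier in the thread just hides the lambda:

(define (first-negative lst)
  (call/cc
    (lambda (return)
      (for-each (lambda (x)
                  (if (negative? x) (return x)))  ; escape as soon as we find one
                lst)
      #f)))                                       ; normal return if none found

; (first-negative '(3 1 -4 1 5)) => -4
; (first-negative '(3 1 4 1 5))  => #f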
