Common Lisp Exception Handling

I recently watched this Google TechTalk video by Peter Seibel on Common Lisp.

Not being very familiar with Lisp, I found it quite interesting, especially the part about how Lisp handles error conditions. If I understand correctly, rather than unwinding the stack immediately, it sends a signal back through which is picked up at a higher level. Then that higher level gets to tell the lower level how to proceed. This seems neat, since as the speaker points out, you don't lose your state while asking the higher levels how to handle the condition.

Since I have not come across this before, it raised a few questions:

How well does this work in practice and are there any caveats?
Could this have been used in more mainstream languages (Java, c#, etc)?
Do other languages handle exceptions in this way?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Being an Old Lisper...

...I'll take a shot these:

SamK: How well does this work in practice and are there any caveats?

It works extremely well in practice. It's quite common in the Common Lisp world for a programmer to encounter a condition in a running program, fix the problem in the running program without stopping it, and let the program continue on its merry way. Google "araneida slime detachtty" for some clues as to how this works, e.g. to remotely update a Common Lisp webapp being run using the Araneida server while it's live.

Caveats: this requires a rather complex runtime system, doesn't map well to popular hardware architectures, and isn't particularly space-conservative. On the software engineering/social side, working this way requires a fair amount of discipline: when you correct a condition this way, you have to remember to reflect the fix in the source code, you have to worry about versioning the code, there's always the possibility of botching things even worse rather than actually fixing things, and so on.

But features such as this are one reason that Lisp (and, to some extent, Smalltalk) remain the cream of the crop of dynamic languages: once they'd made the commitment to being able to change anything at runtime, they (largely unlike their descendants) committed to being able to change everything at runtime.

SamK: Could this have been used in more mainstream languages (Java, c#, etc)?

Not without radical changes: at the very least, you'd need to reify the stack somehow, and probably couldn't sustain type erasure. Basically, condition-handling in Common Lisp is a kind of extreme reflection—even moreso than what Java gives you, in some ways.

SamK: Do other languages handle exceptions in this way?

A visual programming language called Prograph, because it was a dataflow language, also supported halting execution at any time, changing values at any point, rolling back execution to any point, and continuing from that arbitrary point. I thought it was wonderful.

Not without radical changes:

Not without radical changes: at the very least, you'd need to reify the stack somehow, and probably couldn't sustain type erasure. Basically, condition-handling in Common Lisp is a kind of extreme reflection—even moreso than what Java gives you, in some ways.

You can have type erasure and statically typed resumable exceptions. The key observation is that resumable exceptions are a lot like first-class continuations (ie, call/cc) with the extra restriction that you can throw to them only once. This means that you can type them in a type system with linear continuation variables.

EDIT: I should note that Smalltalk does quite a bit more than just resumption. Not only does it have full continuations, but you can examine the stack frame by frame, which lets you implement AOP as a library. (Whether this is a good idea is another matter, but it's worth being clear about what ST can do.)

I Should Have Known...

...that on LtU, saying "you can type them in a type system with linear continuation variables" wouldn't qualify as "a radical change" relative to Java, C#... :-)

Touche!

Touche!

EDIT: Though the funny thing is that linear types are seeing heavy use among PL researchers doing work on Java/C# like languages. They let you typecheck things like object usage protocols, object states, and so on. We might well see linear types in the next round of commercial language designs....

Implementing Restarts

Do you really need complex machinery like stack reification?

My understanding (ok, maybe flawed) is that a condition handler can just be a closure bound to a global (/special) variable, so that the place where the error occurs can find the handler. It will then call the handler, passing in the error condition (this, I'd say, is like a normal function call, with the usual stack behavior). The handler will do what it takes (maybe invoke the whole interactive debugger) and at some point return with a value that tell the restart what to do (throw, use nil as value, use 0 as value...).

While exceptions (throw) need special stack behavior, I don't think that restarts do in general, as there is no real "longjmp", in C terms.

Debug REPL

I'm not entirely sure if this is the same thing as the uses of resumeable exceptions discussed on this page, but here is some code I've occassionally used in Tcl for debugging exceptions. Basically, instead of throwing an error, it simply calls the debug procedure which starts a REPL at the current point of execution. You can then use Tcl's introspection commands to move up/down the stack and examine/set variables etc. I've found it very useful on occassion. No stack reification, but it does rely on the fact that Tcl allows access to the stack at runtime via [uplevel] and [info level].

proc debug-repl {} {
    set cmd ""
    set level [expr {[info level]-1}]
    set prompt "DEBUG\[$level\]% "
    while 1 {
        puts -nonewline $prompt
        flush stdout
        gets stdin line
        append cmd $line\n
        if {[info complete $cmd]} {
            set code [catch {uplevel #$level $cmd} result]
            if {$code == 0 && [string length $result]} {
                puts stdout $result
            } elseif {$code == 3} {
                # break
                error "aborted debugger"
            } elseif {$code == 4} {
                # continue
                return
            } else {
                puts stderr $result
            }
            set prompt "DEBUG\[$level\]% "
            set cmd ""
        } else {
            set prompt "    "
        }
    }
}
proc up {} { uplevel 2 { debug-repl } }
proc down {} { return -code continue }

And to demonstrate it's use, here is a sample session using it:

% proc foo {args} {
     set a 1
     debug-repl
     puts "Resumed, a = $a, args = $args"
}

% proc bar val {
     foo $val
}

% bar 19
DEBUG[2]% info locals
args a
DEBUG[2]% up
DEBUG[1]% info locals
val
DEBUG[1]% set val
19
DEBUG[1]% down

DEBUG[2]% set a "Hello, World!"
Hello, World!
DEBUG[2]% continue
Resumed, a = Hello, World!, args = 19

You can replace the usual [error] handling capabilities to make this automatic for most errors in Tcl. It's also trivial to make this work over a socket or other channel for remote debugging.

Is this something like the debugging uses of CL resumeable exceptions?

Termination and resumption

The classical case against resumption semantics (the kind that Common Lisp has) can be found in Stroustrup's Design and Evolution of C++; the relevant passage is here. I find the arguments, which are based in experience, rather convincing.

Note that old-fashioned Basics have always provided both termination (ON ERROR GOTO) and resumption (ON ERROR GOSUB) semantics, albeit of a restricted kind; Visual Basic.NET provides full resumption semantics, whereas C# and J# provide termination semantics, showing that the .NET CLR can handle both kinds.

"Only" debugging

Mary Fontana presented similar data from the TI Explorer system where resumption was found to be used for debugging only

That's quite an "only"! I would say that debugging is one of the most important activities in writing software.

In Lisp and Smalltalk debuggers are just normal programs that happen to be invoked at the point of an error and happen to call the stack-inspection functions (etc) to help show the programmer what's going on. It's no big deal to write your own debugger or debugger-like function and hook it in. This is much more elegant than putting the whole debugger into the "runtime system" black box, which typically means writing a rigid one in C.

There is also a big convenience factor for interactive use. For example, suppose your program tries to write its output to a directory that doesn't exist and that the program doesn't handle this condition.
Then it's really nice for the system to see the problem and for a 'debugger' function to offer to create the directory (or switch to another) and then continue. The ability to poke around and patch things up at runtime makes unhandled exceptions quite fluid to deal with.

People have an unfortunate tendency to ignore these basic conveniences and focus on hypothetical software-engineering mumbo-jumbo scenarios. "Oh my, what if Luke installed an exception handler that ROT13 encoded every string on the heap, then how would Jane debug her programs?" This is not the way to illumination.

Timing

Just as I wrote that my Bittorrent client 'handled' an IO Exception by printing "IO Error: [Errno 28] No space left on device" and then getting stuck. This is a case where it would have been much more helpful to leave the exception unhandled and let the system's default handler (a 'debugger' function) offer me some alternatives to aborting, like "Try again now". That's how a Common Lisp version would behave.

Can Erlang be this convenient, too?

By wrapping a process by an observer process, that calls catch and then polls on some port for debugger input, can we get the same convenience? The problem I'd see is that catch only get's post-mortem messages, so stack inspection is not possible. Provided this can be dealt with, the advantages however would be: No abstraction violation (the other point against resumption) and no performance hit, if not wanted. But I know too little of Erlang to tell how well this works. (But I know how well it works with CL ;)

Insurance

The biggest use for me is to simply "know" it's there. I'm right now popping out CL code, which may never be hardened enough to see handlers/restarts. BUT, the last time I was writing in Python (a language I learned before hearing of Common Lisp), I felt like I had to get a good idea of how the error handling would look earlier rather than later.

As this is just an anecdotal observation, it may be entirely wrong... but perhaps someone had similar thoughts.

Termination and resumption in C++

The passage you link to refers to but does not relate the arguments for termination that convinced the C++ committee. I would appreciate more details on the experience referred to there.

TI Lisp guys and exceptions in the C++ standard

I found the experience of the TI guys always dubious. My experience on the Lisp Machine is completely different. I would not want to live without that feature. The concept says that you have all kinds of restarts and error handlers in the dynamic context of your code. On signalling an error, the handler is looked up in the dynamic context. But then you don't 'jump' to the handler but you call it like a function. The handler then decides what it wants to do. For example it can present the current applicable restarts to the user. The operating system and the application can provide lots of useful restarts and the Lisp Machine did.

Typical scenarios:

- the application wants to open a file, but it is not there. Now you have various options like using another file, finding the file somewhere else, creating the file and resuming, just retrying, and so on.

- you have a suddenly a loss of network connection. You can now handle the network connection error with several options

- the compiler detects an error while you are in the midst of compiling a larger system. You could fix the error, go into a debugger, recompile the file, recompile the system, etc.

In many situations you can fix the problem and just retry the operation that signalled the error. This can save so much time.

Berkeley ROC

Dylan also has resumable

Dylan also has resumable exceptions. I wrote about one case where I uesd it on comp.lang.dylan a few years back. It's a very useful feature.

Resumable exceptions in Haskell?

Has anyone implemented a resumable exceptions monad? I have a sketch of one, where the exception type is parameterized by the optional return type.

class Monad m => ErrorM e m | m -> e where
    raise :: e b -> m b
    handle :: m a -> (forall b. e b -> m (Either b a)) -> m a

I suspect there are more elegant solutions. (Perhaps something involving Data.Typeable?)

Transactions?

Looks interesting, could you provide a short example of use?

To me, the most important problem in handling exceptions is to accurately distribute responsibilities between two parties - a potentially failing application code (a worker) and the handler. In many cases it might be benefitial if the handler were unaware of details of the worker, just trying to mend the environment and delegating the actual retry back to the worker. This approach is in a way similar to STM - a transaction (a worker) can 'retry', and the transaction scheduler (a handler) will mend the world for it by waiting until some variable from the read set changes, and will resume by executing the transaction again.

My intuition of Haskell is still very weak, so I cannot "see" whether your solution allows such factoring of responsibilities. Could you comment? Thanks.

Simple implementation of (typed) resumable exc

Somehow resumable exceptions are thought of as complex, especially in
the typed setting. One may think they require a linear type system to
make sure that the captured continuation is invoked only once. In
reality, resumable exceptions are simple, requiring no explicit
capture of delimited continuations. Indeed, throwing an exception and
being resumed by the handler is no different from the invocation of
the handler as a regular function. The only difference is the change
in the dynamic environment during the execution of the handler. If the
exception handler declined to resume the computation, we need a way to
abort it. As Luc Moreau first noticed, both regular and resumable
exceptions can be defined with just the dynamic binding and the
'abort' operator.

Here is the simple implementation of resumable exceptions in
OCaml. The only complication is the desire to maintain polymorphism,
that is, to be able to throw exceptions containing values of
arbitrary types and to be able to resume with values of arbitrary
types (and be able to raise resumable exceptions at many places in the
computations -- in arbitrarily typed contexts). Please see the test at
the very end of the code.


http://pobox.com/~oleg/ftp/ML/resumable.ml

cool!

Indeed, throwing an exception and being resumed by the handler is no different from the invocation of the handler as a regular function.

That's a really clever observation!

Resumable exceptions in C++ could be modelled as...

Resumable exceptions in C++ could be modelled as callbacks. Since the problem is that you can not go back to the previous state of the stack, a resumable exception could be a callback parameter to the routine that throws the exception.

When the exception is thrown, instead of unwinding the stack and then invoking the exception, the exception is executed locally as if it was a callback at the point of throwing, and then execution either resumes or returns to the point that it was caught as it is now.

C++ with Restarts and Resumable Exceptions

If the C++ stack had 'restart points' that it could return to after an exception, it would vastly help for modularity by reducing undesired coupling between policy and mechanism (as described in this paper).

The trick is to provide more than one resumption point (so that policies have an actual choice), to properly handle asynchronous exceptions, to do so in a manner reasonably symmetrical to the exception handling mechanisms, and to do so efficiently - where the static and runtime costs of doing so are, ideally, no more than the costs of the existing exception handling mechanisms.

It's doable! A solution is described below.

  1. The 'exception handler' block executes at the top of the local stack much like a local function. Some special handling is provided to appropriately handle both the dynamic context and the ability to call more functions. This behavior is already common to commercial implementations of C++ exception handling, and so should not be problematic.
  2. The exception handler can 'resume', at which point execution would unwind the stack to the appropriately labeled resumption point that is on a stack frame between the exception handler frame and the signaling event. During stack unwind, any destructors (or 'finally' blocks in Java) are called appropriately. The benefit here is to allow the policy to select a given resumption point that partially unwinds the stack (whereas one currently unwinds all the way to the exception handler).
  3. The resumption point can accept a parameter in much the same way as the exception handler receives an exception. This allows resume-with-parameters.
  4. the exception handler could perform any of its normal behaviors just as it does today, with leaving the exception handler block or returning from the function causing the stack to unwind all the way to the exception handler.

For syntax, it would be necessary to mark certain regions as 'restartable' such that they can be found by the exception handler in the same way the exception handler itself was located. A given restart might be selected under a number of conditions. If parameters are required for the restart, there would need to be a block associated with each restart condition prior to entering the restartable action. For symmetry, one might simply put 'restarts' on the opposite side of the 'try' block from the 'catch' blocks.

In the example, I call the resumable blocks reset, and so the try/catch duo becomes reset/try/catch. In Java, it would become reset/try/catch/finally. The behavior of a reset block is to accept the parameter it receives (if a name was specified), execute any behavior inside the following block, then pass control to the try block. When initially entering the try block, reset blocks are simply skipped.


namespace std { 
  template<typename T> class continue_with{ 
     private: T t; 
     public:  T const& value(); 
  }; 
  class signal : public exception {};
  class div_by_zero : public signal{}; 
  // the above would presumably override 'what()'
  // Assume C++ is modified to throw 'div_by_zero()' 
  //  after a division by zero error.  
}

double divide(double a, double b) {
  reset(std::continue_with<double> const& cwd) { return cwd.value(); }
  try { return a/b; }
}

double randf(double a, double b) {
  try { return a + divide(divide(rand(),RAND_MAX+1),(a-b))); }
  catch(div_by_zero const&) { resume(continue_with<double>(0.0)); }
}

The above is a trivial example, intended more to indicate syntax than anything else. If one called randf(1.0,1.0), one would get 1.0 back because the failed 'divide' throw an exception but then would 'continue with' 0.0.

One thing to note is that the resume action seems to be assuming, without testing, that the given resume type is available. Presumably there would be another exception (perhaps unresumable_exception) if this were not the case! While one could catch 'unresumable_exception' to try a variety of possible resumptions, it would be convenient to have another keyword, resumable(type) to dynamically test whether a given resumption condition is available. To be symmetrical, one might also want to add a catchable(type) to test whether a given type will be caught if thrown.

This whole approach to resumable exceptions also introduces the possibility of very cleanly handling asynchronous exceptions, which would include kill signals, timeouts, thread interrupts or alerts, alarms, etc. Basically, an asynchronous signal would itself execute much like a function at the top of the stack that throws an exception. If the interrupt needs to be fully resumable, it simply needs the correct 'reset' behavior:


class signal_kill { class ignore {}; };

void kill() {
  reset(signal_kill::ignore&) { return; }
  try { throw signal_kill(); }
}
void sure_kill() { throw signal_kill(); }

void black_knight(int hitpoints) {
  try {
    while(hitpoints > 0) {
      sleep(3000);
      // output snarky comments, swing sword
    }
  } catch(signal_kill const&) {
    --hitpoints;
    if((hitponts > 0) && resumable(signal_kill::ignore)) {
      cout << "I fight on! Tis only a flesh wound!" << endl;
      resume(signal_kill::ignore());
    }
    // if code reaches this point then black knight is dead
  }
}

void goodknight(bool bHasExcalibur, int patience, Thread* opponent) {
  // assume Thread.interrupt() will queue an event to execute in the
  // given thread where this event may be a signal 
  // (via use of asynchronous exceptions)
  while(patience-- > 0) {
    sleep(3000); // can only get one good blow in every 3 seconds
    if(bHasExcalibur) { opponent->interrupt(&sure_kill); }
    else { opponent->interrupt(&kill); }
    if(opponent->is_terminated()) {
      cout << "Good night." << endl;
      return;
    }
  }
}  

The above does ignore a few difficulties, such as receiving asynchronous interrupts while handling asynchronous interrupts, and properly catching the results. In this case it would work out well enough, though the black knight might die in the middle of saying "I fight on!" if he lost the last of his hitpoints in another asynchronous call. (Because the exception handler unwinds the stack only after resuming, the same handler would necessarily receive all of the sig_kill calls until the black_knight dies, but would be duplicated on the stack in various states of resumption.)

Anyhow, I seriously think this is workable with and reasonably symmetrical to the existing exception handling capabilities found in languages like C++, Java, and C#. As in, we could have it in an experimental version of C++ and ready for integration with whatever comes after C++0x or the next version of Java.

Enlightening as usual

Thanks guys.

No good alternative to restarts

There is no simple/good alternative to restarts:
* when stack is unwinded
** exception safety is very difficult
** resumption is not easily possible
* when problem is clearly defined like ...NotFound/ReadOnly/Locked and solved how to continue without restarts?
** Which line within the try?
** Transaction-Rollback and Repeat? What about physical side effects?
** Compensation? For the second rocket stage?

Simple resumption behind the signal-statement is not enough, like Stroustrup says.

Callbacks can not access arbitrary implementation details of the "worker" like restarts can.
Callbacks can not be resolved dynamically easily.
Callbacks must be assigned (too) early when the details of the error/condition can not be accessed. With conditions the "worker" can choose the signal with the knowledge of the error/condition.

Closures are nearly the same.

AOP has no sufficient acces to implementation details and is too complex for this task.

Simple resumption is no enough

There is no simple/good alternative to restarts:
* when stack is unwinded
** exception safety is very difficult
** resumption is not easily possible
* when problem is clearly defined like ...NotFound/ReadOnly/Locked and solved how to continue without restarts?
** Which line within the try?
** Transaction-Rollback and Repeat? What about physical side effects?
** Compensation? For the second rocket stage?
Simple resumption behind the signal-statement is not enough, like Stroustrup says.

Can you simply delay unwinding the stack until the exception handler returns?

The programmer can chose explicitely where to retry from. Also side-effects can be explicitely compensated (see D scope statement for a convenient way to do this).

dlw on conditions in CL (and elsewhere)

Dan Weinreb has a tangentially (un)related blog post called What Conditions (Exceptions) are Really About. As he states from the outset, it's not really on topic for this thread: "I'm only talking here about the simple heart of the condition feature, not fancy things like restarts." But you may want to skim it anyway.