## A Modular, Verifiable Exception-Handling Mechanism

A while back I began tinkering with the idea of continuation-carrying exceptions as a way to separate error-handling policy from mechanism. Of course, I later discovered I was pouring old wine into new bottles: Common Lisp takes a similar approach with its condition/restart mechanism.

Anyhow, at that time a friend pointed me toward *A Modular Verifiable Exception-Handling Mechanism* by S. Yemini and D. Berry (1985).

The following varieties of handler responses to an exception can be identified in the literature:

1. Resume the signaller: Do something, then resume the operation where it left off.
2. Terminate the signaller: Do something, then return a substitute result of the required type for the signalling operation; if the operation is not a value returning operation, this reduces to doing something and returning to the construct following the invocation of the operation. This includes using alternative resources, alternative algorithms, and so on.
3. Retry the signaller: Do something, then invoke the signaller again.
4. Propagate the exception: Do something, then allow the invoker of the invoker of the signalling operation to respond to the detection of the exception.
5. Transfer control: Do something, then transfer control to another location in the program. This includes doing something and then terminating a closed construct containing the invocation.
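As a concrete illustration of responses 2–4, here is a small hypothetical sketch (the operation `fetch`, its failure condition, and the handler names are all invented): the signalling operation delegates to a caller-supplied handler, and different handlers realize different responses.

```python
# Hypothetical sketch: a signalling operation that asks a caller-supplied
# handler what to do. All names here are invented for illustration.

def fetch(resource, attempt, on_fail):
    """Pretend fetch: fails for 'flaky' on the first attempt."""
    if resource == "flaky" and attempt == 0:
        return on_fail(resource, attempt)  # signal: ask the handler
    return "data:" + resource

def substitute(resource, attempt):
    # 2. Terminate the signaller: return a substitute result of the
    #    required type.
    return "default"

def retry_once(resource, attempt):
    # 3. Retry the signaller: invoke the operation again.
    return fetch(resource, attempt + 1, retry_once)

def propagate(resource, attempt):
    # 4. Propagate the exception: let the invoker's invoker respond.
    raise LookupError(resource)

print(fetch("flaky", 0, substitute))  # default
print(fetch("flaky", 0, retry_once))  # data:flaky
```

Response 1 (resumption) would have the suspended operation continue with a value supplied by the handler, and response 5 (transfer of control) needs a non-local exit, e.g. an exception caught at the target location.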

This paper presents a new model for exception handling, called the replacement model. The replacement model, in contrast to other exception-handling proposals, supports all the handler responses of resumption, termination, retry, and exception propagation, within both statements and expressions, in a modular, simple, and uniform fashion. The model can be embedded in any expression-oriented language and can also be adapted to languages which are not expression oriented with almost all the above advantages. This paper presents the syntactic extensions for embedding the replacement model into Algol 68 and its operational semantics.

Without [an exception-handling mechanism], too much information is not hidden and coupling is high. Either the signaler has to be told more about what the invoker is doing, so that the signaler can do what the invoker would want done, or else the invoker has to be given more implementation details so that it can do the exception checking.

Whether the exception-handling mechanism be continuations or something else, I'd really love to see the modern stack-based languages (Java, C++, etc.) implement something much more along these lines. I have been bitten far too often by the high coupling that comes from the inability to separate error handling policy from the exception handling mechanism.

## Comments

That would be interesting to see in a modern imperative language. It can't be more irritating than Java's checked exceptions.

Some of the design decisions would probably change in a functional language, where individual functions tend to be shorter and higher-order functions are cheap. You can approximate the style fairly closely in Haskell with the multi-prompt continuation monad, using this subset of the operations:


```haskell
promptP :: (Prompt r a -> CC r a) -> CC r a
abortP  :: Prompt r a -> CC r a -> CC r b
```


Instead of using a new construct for declaring exceptions, we can pass the handlers as functions:


```haskell
convert :: (Prompt r String -> Int -> CC r Char) -> [Int] -> CC r String
convert badcode codes = promptP $ \p ->
  let conv (i, code) = if validCode code
                         then return (chr code)
                         else badcode p i
  in mapM conv (zip [0..] codes)
```

You pass the appropriate handler as "badcode". It can return a Char, use the prompt to return directly from convert, or do something else.

```haskell
-- replace all bad codes with '?'
h1 _ _ = return '?'
-- runCC (convert h1 [65,-1,66])  =>  "A?B"

-- terminate early, returning ""
h2 p _ = abortP p (return "")
-- runCC (convert h2 [65,-1,66])  =>  ""

-- replace the code with zero & retry
h3 codes p i = let codes' = replaceAt i 0 codes
               in abortP p (convert (h3 codes') codes')
-- runCC ((\c -> convert (h3 c) c) [65,-1,66])  =>  "A\NULB"

-- call out to some other handler
foo final ... = promptP $ \f ->
  ...
  let h4 p i = abortP p (final f)
  ...
  convert h4 codes
  ...
```


I guess it's not too surprising that you can get this functionality with the multi-prompt, delimited continuation monad, since it's so powerful. (Arguably too powerful; it might be better to hide the prompts from user code.) The part that took me a while was explicitly passing the exception handler(s) to the code, which does seem to be the best fit for the mechanism described in the paper.
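For readers without the CC monad at hand, the convert example can be transcribed roughly into Python (a sketch only: an exception stands in for the prompt's non-local return, and `replace_at` plays the role of the assumed replaceAt):

```python
# Rough Python transcription of the Haskell convert example. An
# exception plays the role of the prompt: raising _Return aborts the
# conversion and returns the carried value from convert directly.

class _Return(Exception):
    def __init__(self, value):
        self.value = value

def valid_code(c):
    return 0 <= c < 0x110000

def convert(badcode, codes):
    try:
        return "".join(chr(c) if valid_code(c) else badcode(i)
                       for i, c in enumerate(codes))
    except _Return as r:
        return r.value

def replace_at(i, x, xs):
    return xs[:i] + [x] + xs[i + 1:]

# replace all bad codes with '?'
h1 = lambda i: "?"

# terminate early, returning ""
def h2(i):
    raise _Return("")

# replace the code with zero & retry
def h3(codes):
    def handler(i):
        codes2 = replace_at(i, 0, codes)
        raise _Return(convert(h3(codes2), codes2))
    return handler

print(convert(h1, [65, -1, 66]))                      # A?B
print(repr(convert(h2, [65, -1, 66])))                # ''
print(repr(convert(h3([65, -1, 66]), [65, -1, 66])))  # 'A\x00B'
```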

### I'd like to see it in C++

That would be interesting to see in a modern imperative language.

Actually, I believe it to be quite doable. I wasn't handling 'verification', but I did explore an implementation of the whole restart mechanism here.

Considering how symmetrical that proposed solution is to the existing systems, and that I can't find any implementation hurdles, I'm more surprised that we don't have it already.

It can't be more irritating than Java's checked exceptions.

That has me seriously laughing out loud. Java developers should know better than to integrate 'new' features in what is intended to be a mainstream language without seeing them tested first.

In any case, checked 'resumption' conditions could be made part of the exception or part of function signatures, but I'm not certain it would be a worthwhile pursuit as checking these things would ultimately limit user-created resumption policies. Actually, I'm against checked exceptions for the same reason: they severely limit the distance for which error handling policy can be decoupled from the code that introduces the error.

### Actually, I'm against

Actually, I'm against checked exceptions for the same reason: they severely limit the distance for which error handling policy can be decoupled from the code that introduces the error.

I don't think that's so much a knock against checked exceptions, as it is a knock against the policy that exceptions can't propagate automatically to enclosing scopes. I think every function should have an effect variable which denotes the exceptions that can be thrown from its execution. This effect variable is a union of all effects of calling child functions (which should be fully inferrable). 'main' should simply have a signature with a nil effect, so no exceptions escape the execution of the program as a whole. Thus we achieve checked exceptions, but without the headaches that Java imposes.
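The inference described above can be sketched as a fixpoint computation over the call graph (the function names and raise sets here are invented for illustration):

```python
# Toy sketch of effect inference: each function's effect is the union
# of the exceptions it raises directly plus its callees' effects,
# computed as a fixpoint so that recursion is handled too.

raises = {"read_file": {"IOError"}, "parse": {"ParseError"},
          "load": set(), "main": set()}
calls  = {"read_file": [], "parse": [],
          "load": ["read_file", "parse"], "main": ["load"]}

def infer_effects(raises, calls):
    effects = {f: set(rs) for f, rs in raises.items()}
    changed = True
    while changed:                       # iterate to a fixpoint
        changed = False
        for f, callees in calls.items():
            for g in callees:
                if not effects[g] <= effects[f]:
                    effects[f] |= effects[g]
                    changed = True
    return effects

effects = infer_effects(raises, calls)
print(sorted(effects["main"]))  # ['IOError', 'ParseError']
# Under the proposal, 'main' having a non-nil effect like this would be
# rejected until handlers discharge IOError and ParseError.
```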

### Ah, yes. That would be

Ah, yes. That would be correct; it isn't the 'checking' that is the problem, but rather both the requirement for manifest declarations of what must be checked and the inability to automatically propagate such checks. In what I wrote above, I was considering "checked exceptions" with the Java design.

Regardless, I'd be tempted to simply guarantee all exceptions are checked by having the 'default case' for exceptions be provided as a standard behavior via the process or thread task, allowing programmers to override these defaults at will.

This effect variable is a union of all effects of calling child functions (which should be fully inferrable).

I have a hard time imagining how this solution would work at module boundaries without sacrificing separate compilation.

On the other hand, I'm not so opposed to sacrificing separate compilation so long as one can at least integrate some pre-compiled forms (with 'pre-compiled' being less restrictive than 'separately compiled' in that it allows one to have a compilation ordering requirement).

'main' should simply have a signature with a nil effect, so no exceptions escape the execution of the program as a whole

There are times it would be quite useful for something like 'main' to propagate exceptions so long as the host knows how to process them.

This is OT, but the whole prescribed notion of 'main' and 'program as a whole' is something I dislike as a language standard concept for other reasons. How a host environment (such as Unix or a shell) interprets a library or dictionary of executable or evaluable code should really be left outside of the language's definition.

### I have a hard time imagining

I have a hard time imagining how this solution would work at module boundaries without sacrificing separate compilation.

Since every function now sports an effect variable, and ML modules can be statically erased, I don't see a problem in principle. First-class modules or modules as first-class values might be more challenging since the module used could change at runtime.

How a host environment (such as Unix or a shell) interprets a library or dictionary of executable or evaluable code should really be left outside of the language's definition.

I disagree. I believe it was you who was arguing that languages and operating systems should be essentially unified, and I agree. In that case, consider a system where the OS is built in a safe language, ala Singularity OS. The user's shell dynamically loads the code for the program, but in order for it to launch the program, the program entry point must have a well-defined signature, such as implementing some Application interface:

```
interface IApplication {
    void Main(string[] args);
}
```

The signature could be arbitrary of course, but there has to be something well-defined for the dynamic code loader to typecheck and the shell or OS to invoke.

### Host Environment

The user's shell dynamically loads the code for the program, but in order for it to launch the program, the program entry point must have a well-defined signature, such as implementing some Application interface:

Modulo reflection, I agree. But I'd note it important that it is the user's shell that defines "the necessary interface". The language designers are not 'prescribing' any particular meaning to 'IApplication' or 'Main'. To the language, those are not special at all.

With reflection - the ability to inspect the 'object' file selected for execution - things become much more interesting. A shell could use 'main' by default, but still allow access to other functions via simple parametric extensions to the name (such as ':foo arg1 arg2 arg3'), with (' args') being the same as (':main args').

Access to type-descriptors (foo : int int -> int) for the arguments could be used to help parse the arguments or even support tab-completions, and could certainly be utilized to support typesafe shell operations and workflows.

Alternatively, a shell could default to 'open ' as the behavior, automatically importing both 'foo' and 'main' and any other exports as commands in the local environment and thus allow one to divide the execution from the environment manipulation. Or both could be possible (in Java, for example, one may use classes without importing them, so long as they use the long name). In this sense, the shell is really an interpreter for a given language that can be extended, and also happens to (in the default configuration) include extensions to help locate commands a user might be interested in executing (such as installed services).

```
#myshell> open math
ok
#myshell> square 20
400
#myshell> x ## (square x) == 81
x=9
#myshell> play funnymovie.avi
executing play:main filename=funnymovie.avi...
```

I believe operating systems, shells, and languages should be fully unified. I would not stop at half-measures like forcing everything through a shell's 'IApplication' interface.

### [Off-topic] Windows Powershell

This is probably getting a bit off topic for the thread, but Windows PowerShell might be of interest to you. It's a shell which, amongst other things, allows .NET objects to be piped between commands instead of just text; and which specifies an interface that objects can implement for better integration with the command line (including argument-specific tab-completion).

### But I'd note it important

But I'd note it important that it is the user's shell that defines "the necessary interface". The language designers are not 'prescribing' any particular meaning to 'IApplication' or 'Main'.

Other than it's the application initialization point, I don't see what kind of meaning there could be.

Access to type-descriptors (foo : int int -> int) for the arguments could be used to help parse the arguments or even support tab-completions, and could certainly be utilized to support typesafe shell operations and workflows.

I don't think reflection adds anything really. One can just as easily create a strongly typed interface for all of the features you're describing, and the application can itself provide the tab-completion.

```
public interface IApplication {
    void Main(string[] args);
    string[] InferArg(string fragment);
}
```

I think concise language constructs, like lambda args and type inference, help more than reflection here.

### At what cost?

Other than it's the application initialization point, I don't see what kind of meaning there could be.

Even granting that you can't see it being used any other way, can you offer a good reason why the language designers should prescribe such a notion as 'Main' being an 'application initialization point'? Why should language designers even accept the concept of 'application'? You seem stuck on the idea of reinventing a Unix shell in a language. That is not at all the only option available to you.

I don't think reflection adds anything really.

Reflection buys you the ability to implement the shell as a common language with a shared parser used by all the applications. It also buys you the ability to implement strongly typed workflow languages (like pipes and filters, or service mashups).

One can just as easily create a strongly typed interface for all of the features you're describing

Strongly typed, perhaps. But you're going to be performing runtime typechecking in any shell application because the 'code' isn't produced until runtime. Runtime/dynamic typechecking (weak or not) may be implemented inside 'Main' by reinventing a poor man's parser for each and every application. Alternatively, you can invent it once and only once (and unify the syntax for running commands) by moving both the typechecking and the parsing duties into the shell.

and the application can itself provide the tab-completion [...] I think concise language constructs, like lambda args and type inference, help more than reflection here.

The approach you propose is considerably less flexible or "strongly typed" than it seems you believe it is, considering that you'll be unable to support composition of commands and functions and you'll still suffer type mismatches whenever you need to parse input parameters (except worse because YOU need to implement the parsing and the parameter validation each time). At the same time you lose efficiency for both the programmer (who must integrate a parameter parser into each application) and the runtime (which will be continuously marshalling and demarshalling structured data as strings).

I'm all for static typing whatever can be typed statically. But providing static typing over commands parsed and introduced at runtime means knowing types. And knowing types of objects that have already been compiled into programs requires the same set of environmental features as to support runtime reflection.
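The shell-side parsing and typechecking argued for here can be sketched in a language that does expose runtime type descriptors; in this Python sketch (`square` and `shell_invoke` are invented names), the shell parses and validates string arguments once, for every command:

```python
# Sketch: a shell uses reflected type descriptors (here, Python
# annotations) to parse and typecheck command arguments, instead of
# each application reimplementing its own argument parser.
import inspect

def square(x: int) -> int:           # a "command" exported by a module
    return x * x

def shell_invoke(fn, argv):
    """Parse string arguments according to fn's type descriptors."""
    params = inspect.signature(fn).parameters.values()
    parsed = []
    for arg, param in zip(argv, params):
        ty = param.annotation         # type descriptor via reflection
        try:
            parsed.append(ty(arg))
        except ValueError:
            raise TypeError("%s: expected %s, got %r"
                            % (param.name, ty.__name__, arg))
    return fn(*parsed)

print(shell_invoke(square, ["20"]))   # 400
```

The same descriptors could drive tab-completion or workflow typechecking, which is the point being made about reflection above.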

### Despite that you can't see

Reflection buys you the ability to implement the shell as a common language with a shared parser used by all the applications. It also buys you the ability to implement strongly typed workflow languages (like pipes and filters, or service mashups).

Reflection buys you far more dangers than benefits, and I used to be firmly in the other camp. Of course, this all depends on the scope of the reflective capabilities, whether it includes abstraction-breaking introspection, etc.

And reflection doesn't buy you any workflow or filtering capabilities. Those are readily implemented by higher-order functions and/or objects, as evidenced by any ML, none of which have reflection.

I think what you really want is to embed the parser and a language interpreter, ala "typing dynamic typing". So you basically want a REPL shell.

But you're going to be performing runtime typechecking in any shell application because the 'code' isn't produced until runtime.

Not necessarily. Code isn't loaded until runtime, which is very different from saying that it's produced/generated at runtime. Either way, a program must provide a type signature for the code it is attempting to load:

```
val load_code : 'a Sig.t → in_channel → 'a Rep.t option
```

Rep.t is a 'representation type' often appearing in polytypic programming literature (which is a safer form of reflection). For simplicity, I'm assuming the Sig.t defines the structure of the expected type.
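A loose Python analogue of this load_code idea (all names here are invented): the loader compares the loaded entry point's type descriptor against the signature the caller expects, and returns None, the option failure case, rather than handing over ill-typed code.

```python
# Sketch of signature-checked dynamic loading. The "module" is just a
# dict of exported functions; annotations stand in for Rep.t.

def load_code(expected_name, expected_type, module_dict):
    """Return the entry point only if it matches the expected signature."""
    fn = module_dict.get(expected_name)
    if fn is None or getattr(fn, "__annotations__", None) != expected_type:
        return None                   # signature mismatch: refuse to load
    return fn

def main(args: list) -> int:          # the dynamically "loaded" code
    return len(args)

entry = load_code("main", {"args": list, "return": int}, {"main": main})
print(entry(["a", "b"]))              # 2

bad = load_code("main", {"args": str, "return": int}, {"main": main})
print(bad)                            # None
```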

Now the user's shell is a text-based interface, so we can go two routes here: a full blown REPL shell, or a smaller standardized interface language for constructing and invoking subsystems.

You also seem to imply that my original suggestion required the entire program to be loaded into memory, but really, only a small launcher component need be instantiated for tab-completion and initialization. A language-based OS is a sea of objects, not heavy monolithic processes like current OSs, so when I say IApplication, I mean literally a single object, not all the objects instantiated when Main is invoked.

[Edit: note that, even with the REPL shell option, the code must still export either a standardized symbol table, or a standard entry point which is executed on code load and loads all symbols into a global table. There's always a standard interface somewhere! :-)]

### Reflection

And reflection doesn't buy you any workflow or filtering capabilities. Those are readily implemented by higher-order functions and/or objects, as evidenced by any ML, none of which have reflection.

'IApplication' is "a smaller standardized interface language for constructing and invoking subsystems." It is not ML. It does not have ML's capabilities.

By using 'IApplication', the shell program cannot determine the 'real' types of the inputs to the published applications. That is, when a user types in a 'wrong' string, the shell can't issue any warnings or do anything similar. The shell certainly cannot typecheck its workflows or pass complex objects as inputs to 'application' methods.

I think what you really want is to embed the parser and a language interpreter, ala "typing dynamic typing". So you basically want a REPL shell.

Essentially, yes. But it doesn't need to be a REPL shell; the HCI is more open, and could be an Object Browser akin to Squeak, ToonTalk, or Second Life. But a REPL browser would be a convenient version.

And the typing doesn't need to be dynamic. But static typing of each shell command, as it arrives, requires the language runtime maintain all the information one would usually associate with static type reflection.

The ability to create alternative shells that were not strictly REPL loops would certainly benefit from the ability to query for the appropriate metadata, implying reflection.

Reflection buys you far more dangers than benefits

Eh, well, I'll agree it can cause problems. But the issue is never reflection by itself, but rather reflection in the context of some assumptions used in the development of other language features. And I'm willing to trade some of those assumptions in order to buy reflection without the dangers. Besides, most of the assumptions 'problematic' when combined with reflection (such as use of abstraction as a mechanism for encapsulation, and the use of encapsulation as a mechanism for security, and the use of named types with inheritance) are also problematic for other language features that may be nice to have in an OS-integrated language (including distribution or security).

Code isn't loaded until runtime, which is very different from saying that it's produced/generated at runtime.

I literally meant produced/generated at runtime. But you might be confused: I'm talking about the shell code - the stuff the user types dutifully into the REPL interpreter or whatever. The shell code is among the stuff that, if "the language and shell were integrated", could be typechecked, abstractable, capable of producing and handling exceptions, etc.

You also seem to imply that my original suggestion required the entire program to be loaded into memory, but really, only a small launcher component need be instantiated for tab-completion and initialization. A language-based OS is a sea of objects, not heavy monolithic processes like current OSs, so when I say IApplication, I mean literally a single object, not all the objects instantiated when Main is invoked.

What you say here confirms what I already understood of your claims. I don't take issue with the above. Where I have a problem is you presenting 'IApplication' as a solution then pointing at 'ML' when attempting to explain the properties of your solution.

### REPL shells for statically typed languages

The link I provided earlier to tagless interpreters and "typing dynamic typing" can be used to achieve what you're looking for. A language's standard library can provide a module which is a self-interpreter for the language (or, preferably, a JIT), and another module which is the language's parser. Constructing a REPL shell from this is somewhat trivial. In fact, I think all languages should be built in this metacircular fashion; metacircular interpreters for statically typed languages FTW! :-)
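Python illustrates the stdlib-parser-plus-interpreter arrangement (though dynamically typed, so this shows only the construction, not the static typing): the read-eval core of a REPL is a few lines, minus the loop and error handling.

```python
# The read-eval steps of a REPL, built from the language's own
# standard-library parser (ast) and interpreter (compile/eval).
import ast

def read_eval(src, env):
    tree = ast.parse(src, mode="eval")                  # parser module
    return eval(compile(tree, "<shell>", "eval"), env)  # self-interpreter

env = {}
print(read_eval("6 * 7", env))  # 42
```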

### Restartable exceptions

... are also present in some Smalltalks. There it is cast in an object-oriented form which may be more amenable to implementations for other OO languages.