Why are exceptions not described as 'purely functional'?

Say we have the expression -

(/ 1 0)

In my purely functional programming language (with strict evaluation), I currently handle the evaluation by returning a 'violation' value that describes the problem like so -

(violation :v/contract/divByZero "Division by zero." ())

This ‘preserves’ referential transparency for all expressions by making them all replaceable by their value, even if that value is only a violation. But this is different from other purely functional languages I've seen.

Whenever I read a description of the properties of impure functional languages (such as ML), exceptions are listed in the imperative section as an imperative feature. I find this a little odd, as I'm not sure how a purely functional language would handle a division-by-zero error without exceptions (or without violations like I use).

Haskell has exceptions, but they can only be handled in the IO context. This makes sense to me because Haskell is lazy. My language is strict, so I don't see why I would be violating purity by having exceptions. After all, what makes a language purely functional is the pervasiveness of referential transparency (replacement of any expression with its resulting value). But why would we care about preserving referential transparency for the set of expressions that have no possible value in the first place? Would the replacement of violations in my language with exceptions technically render it 'impure'?


I do have "classical"

I do have "classical" exceptions in my language (http://elalang.net) which is also strict by default and the language itself is basically pure (side effects can be done only through foreign functions).

I am not sure that there is only one "right" answer to this question.
Probably we can try to generalize the question. Exceptions basically do a goto from one function to another, and because of that they are somewhat close to coroutines. So we can ask whether the presence of coroutines (e.g. an ability to "jump" from one function to another) makes a language impure.

By the way, in the D language there is a "pure" annotation that one can put on a function (and the compiler ensures that this function is pure). However it is completely legal to raise exceptions from such functions.

Is it legal to handle

Is it legal to catch exceptions in D's pure code? We can raise exceptions anywhere in Haskell, but can handle them only in the IO context.

As I understand - yes.

As I understand it - yes. However, it doesn't prove anything; it's just an example of a language designer's decision.

Because if you have

Because if you have exceptions that go up the call stack, then you can no longer use the same value-based reasoning rules. For example:

let y = f x in x+1

In a purely functional language, this is equivalent to:

x+1

But if f raises an exception, this equivalence no longer holds.
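
A small Haskell sketch can make the failure concrete (Haskell is lazy, so `seq` is used here to simulate the strict binding; the throwing `f` is made up for illustration):

import Control.Exception (SomeException, evaluate, try)

f :: Int -> Int
f _ = error "exception raised by f"

-- simulates the strict binding `let y = f x in x+1`
withBinding, withoutBinding :: Int -> Int
withBinding x = let y = f x in y `seq` (x + 1)
withoutBinding x = x + 1

main :: IO ()
main = do
  print =<< (try (evaluate (withBinding 5))    :: IO (Either SomeException Int)) -- Left <exception>
  print =<< (try (evaluate (withoutBinding 5)) :: IO (Either SomeException Int)) -- Right 6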

I have to ask again, tho, why

I have to ask again, tho, why would we care about preserving referential transparency for the set of expressions that have no possible value in the first place?

Any invalid expression has no value, and therefore there is no defined equivalence between the invalid expression and any value. Thus that equivalence cannot hold in the first place, and therefore no change in semantics seems to occur when introducing exceptions into a strict, purely functional language.

Am I off in this reasoning?

Oh, wait, let me look at

Oh, wait, let me look at your example again... I think I missed your point.

As soon as you can catch

As soon as you can catch failures, you can turn an exception-raising expression into a value.

exception A
exception B
let test =
  try (raise A, raise B)
  with A -> (1,0) | B -> (0,1)

This code allows you to observe evaluation order; one of the rules (f(), g()) = (let x = f() in (x, g())) or (f(), g()) = (let x = g() in (f(), x)) does not hold.

(Note that Jules's earlier example also lets one observe non-termination: case f x of (_, _) -> x + 1 is not equivalent to x + 1.)

Yes, and in the hope it's

Yes, and in the hope it's relevant to the OP's post and could help: I for one like the example of try/catch desugaring given in this Matt Might blog post, which I think captures the idea (from the CPS angle):

"Implementing exceptions in CPS" section (in JavaScript)

http://matt.might.net/articles/by-example-continuation-passing-style/

In essence, once the sub-problem of generalizing the transform from { JS exception semantics + partial functions } to { "exceptional" continuations + total functions } is solved practically enough(*), I think that's also what dmbarbour refers to as "Pass a Handler / Error Counseling" in the LtU thread he already pointed to.
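
To make that transform concrete, here is a minimal sketch in Haskell (rather than JavaScript); the names and the String error type are made up for illustration:

-- each CPS'd computation takes a success continuation and an error continuation
type CPS r a = (a -> r) -> (String -> r) -> r

divCPS :: Int -> Int -> CPS r Int
divCPS _ 0 = \_  err -> err "division by zero"
divCPS x y = \ok _   -> ok (x `div` y)

-- "try/catch" just installs a different error continuation around the protected expression
tryCPS :: CPS r a -> (String -> CPS r a) -> CPS r a
tryCPS body handler = \ok err -> body ok (\e -> handler e ok err)

example :: Int
example = tryCPS (divCPS 1 0) (\_msg ok _ -> ok 0) id error   -- recovers with 0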

I do like the Pass a Handler / Error Counseling pattern, too, if only out of empirical experience (most often positive).

'HTH

(*) Without ever trying to generalize it, I did find myself reinventing that one a couple of times in yet another context, C#, even before reading Matt's post. I've always wondered, btw: what does that tell us about our natural, instinctive capability to work around our "favorite" languages' weaknesses by the rather mysterious means of finding better solutions? ;)

In a purely functional


In a purely functional language...

I have the impression that you are talking about a lazy functional language in the first place, not just a pure one. In a language with eager evaluation, an equation like this:

let y = f x in x+1

can be broken simply because evaluation of "f", "x" or "f x" results in _|_.

In a lazy language, however, an equation can be broken by explicit sequencing:

b = c `seq` 42 --What if c is bottom?

Indeed, non-termination is

Indeed, non-termination is an effect as well. With non-catchable exceptions (like the OP seems to be talking about?) the reasoning rules that are no longer valid are the same ones that are no longer valid if you have non-termination (because an exception that you can't catch behaves the same as a non-terminating computation). If you can catch exceptions then even fewer reasoning rules remain valid (see gasche's response above).

Maybe Just Either Left, Right or Nothing

If you return failure as a value, e.g. with an Either type (Left x | Right error) or Maybe type (Just x | Nothing), then you still have referential transparency.

However, raising exceptions - i.e. jumping to a handler - is not the same as returning errors as a value. This violates RT because it calls a continuation that depends on context and was not an explicit argument to the function.

I consider exceptions in Haskell to be a wart on the language. But modeling them explicitly (e.g. with Control.Monad.Error) isn't so bad.
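
As a minimal Haskell sketch of failure-as-a-value (the names are made up):

safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "division by zero"
safeDiv x y = Right (x `div` y)

-- the caller inspects the result; there is no hidden jump to a context-dependent handler
use :: Int -> Int -> Int
use x y = either (const 0) id (safeDiv x y)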

However, raising exceptions

However, raising exceptions - i.e. jumping to a handler - is not the same as returning errors as a value. This violates RT because it calls a continuation that depends on context and was not an explicit argument to the function.

I am not sure that it is that black and white in reality.
Let's say that we have a division function that reports an error using a "sum type":

let x `div` 0 = None
    x `div` y = Some (x / y)

This is how you can analyze its result:

match x `div` y with
      Some x = x
      None = 0

This is perfectly valid and doesn't violate RT, right? The 'div' function always returns None if its second argument is 0. You can however 'analyze' this result and effectively transform it into something else. For example, you can write code like so:

match x `div` y with
      Some x = Foo x
      None = Bar

The fact that this expression evaluates to a different value than the previous one doesn't make the 'div' function dependent on the context, right?

OK, and now let's deal with the "regular" division function that can raise an exception:

let x `div` 0 = fail "Division by zero."
    x `div` y = x / y

The result of this function seems to depend only on its arguments as well. If the second argument is 0 then you will always get an exception as a result. You can however capture this result using a construction like this:

try x `div` y with
    e = ...

It looks quite similar to what was used before. I am not sure that this 'try' block actually changes the result of the 'div' function - its result still depends on its arguments only. The 'try' block seems merely to transform the result into some other value. What prevents us from considering this exception object as a special case of a return value that can be matched using a specific construct?

To explain the meaning of

To explain the meaning of "fail", you have to change your language semantics: you go from a world where each program subexpression evaluates to a value (and the meaning of an expression is composed from the meanings of its subexpressions) to a world where each program subexpression evaluates to either a value or some form of failure (and the meaning of an expression is composed from the meanings of its subexpressions, considering value *or* failure for each).

You can give a compositional semantics of any language feature in this way, if you are allowed to change the language semantics (eg. GOTO can be explained in a semantical way). That's what operational semantics do.

On the other hand, you may want to stay in the world where you try to explain everything with values only. This is done by changing `div` to work with options; but to do that you have to change your program globally, or at least non-locally: plus x (div y z) must now be written differently, because the types don't match anymore (`plus` expects an `Int` as right-hand side, not `Maybe Int`). This is a monadic translation. You have done pragmatically what Moggi did theoretically when he explained how to give a semantics to effects as a global program-to-program rewriting step towards a pure calculus.
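
In Haskell, that non-local rewrite looks roughly like this (a sketch, assuming a Maybe-returning division):

divM :: Int -> Int -> Maybe Int
divM _ 0 = Nothing
divM y z = Just (y `div` z)

-- `plus x (div y z)` can no longer be written directly; the Maybe monad threads the failure
plusDiv :: Int -> Int -> Int -> Maybe Int
plusDiv x y z = do
  q <- divM y z
  return (x + q)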

(Another translation, instead of wrapping all may-fail computation in a sum type, is to abstract them over an error handler; what you will get is exactly a CPS translation, also monadic in nature)

May be this problem actually

Maybe this problem actually needs a more pragmatic approach. Programming languages are not just calculation models; they are intended to solve problems, to be practical.

For example, from a practical point of view, changing 'div' to work with options in a dynamic language is not really an option, mainly because all operations in a dynamic language would then have to work with options - since all operations, including 'x + y', can fail at run-time with a type error.

If you take a strict language, the situation is quite different from a lazy language as well. Moreover, I believe a strict language cannot be explained with values only - its semantics is already different. One can even doubt that a strict language can actually claim to have RT when an equation like:

b = snd (a,b)

can effectively fail.
Even without exceptions we already have "value or failure" semantics here. By introducing non-catchable exceptions we change nothing here. And with a 'catch' construct for exceptions we are writing code which seems to me quite monadic in nature. We are basically encapsulating GOTO in some sequential calculation, and this whole calculation is perfectly transparent, e.g.:

let x `div` y = try x / y with
                    DivisionByZero = 0

The 'div' function here is pure; one can use it in expressions and these expressions will be referentially transparent. But what happens inside try/with is some sort of "magic" that really resembles the IO monad's "magic".

In fact I doubt that there

In fact I doubt that there can be a Turing-complete language where everything can be explained with values only, because _|_ is not a value.

Indeed, "everything

Indeed, "everything evaluates to a value" is a programmer view of what logicians call "cut elimination", which implies strong termination and, generally, consistency of the underlying logic. Turing-complete languages have non-termination and are therefore inconsistent logically.

That said, Turing completeness is not a particularly desirable property for a language; if you can do anything you want to do in a non-Turing-complete language, you're fine. In particular, you may speak about non-terminating programs, by capturing non-termination in a monad or some other form of effect.

Of course, effectful languages allowing non-termination, such as Haskell, are also fine. Regarding Haskell again, it's funny that having lazy evaluation as its core actually led people to a very casual approach to non-termination; for example Haskellers are generally quite proud that their language "allows recursion everywhere by default", because they consider bottom usual business, while in comparison the community of strict ML languages is much more averse to general value or module recursion, posing sometimes annoying static restrictions to ensure they're well-defined.

Excellent points in your

Excellent points in your second and third paragraphs that I can relate to, very much so.

Coincidentally, my spare-time personal research involves modeling the embeddings and inclusions of language phrases within each other, and allowing those languages' identities to be optionally annotated with claims about their semantic properties, such as strictness vs. laziness of their resolvable implementations, in a meta language having the flavor of a simply typed lambda calculus. By resolvable implementations I mean processors as resources, by identities I mean URIs, and by annotations I mean the so-called "base types" of the latter.

I am still trying to figure out a useful set of type kinds and their accompanying constructors (their arity, etc.), but my aim is obviously to take advantage of the always-terminating type checking and type inference of HM(X) to deduce the important properties of interest of the outermost phrase.

This idea of looking at languages as first-class "values" came to me after that oft-recurring observation you made about the tensions that exist, de facto, between various languages' design choices and their implementation aspects, and what those tensions induce over (read: against) language interoperability.

(as I am of the "church"* of the non-believers in the "one language to rule them all" sort of motto, as you probably guessed)

*just a metaphor!

Not exceptions but conditions

The Lisp languages have different terminology for what happens in an exceptional situation (an exception) and an object whose value reports the problem (a condition). What you have here is that (/ 1 0) returns a condition directly without raising an exception.

Then if you require that a function which is passed a condition, and doesn't advertise that it can handle conditions (perhaps by its signature in a statically typed language, by other means in a dynamically typed one), immediately returns the condition without further ado, you get a simple form of polymorphism. That is pure and functional and gets you what I think you want: you can even use setjmp-style exceptions under the table to implement it.
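
A rough Haskell sketch of that pass-through convention (the V type and the names are made up):

-- values are either ordinary numbers or conditions describing a problem
data V = Num Int | Condition String

-- a function that doesn't handle conditions simply returns the one it was given
addV :: V -> V -> V
addV c@(Condition _) _ = c
addV _ c@(Condition _) = c
addV (Num a) (Num b)   = Num (a + b)

divV :: V -> V -> V
divV c@(Condition _) _ = c
divV _ c@(Condition _) = c
divV _ (Num 0)         = Condition "division by zero"
divV (Num a) (Num b)   = Num (a `div` b)

-- addV (Num 1) (divV (Num 1) (Num 0))  evaluates to  Condition "division by zero"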

It's been a while since I

It's been a while since I looked at Lisp, and I'd forgotten about its condition system! It's pretty cool as far as an imperative error handling system goes, but I'm not sure a pure functional language could utilize the sophistication of such a system. Returning a violation object would seem simpler and strictly no less powerful. Perhaps I've overlooked something?

Conditions with pure functions

Conditions would be easy enough to express with continuation passing, even in a purely functional language. Similarly, resumable exceptions. But continuation passing is a bit painful to model by hand. We could use a continuation monad - still pure and deterministic, but the embedded language of a monad is generally not RT.

If you're looking for a summary of error handling patterns, I've linked you to one.

You don't even need full

You don't even need full continuations; dynamic scoping + exceptions (or state + exceptions) is enough to macro-express resumable conditions.

I'm assuming you don't have

I'm assuming you don't have "dynamic scoping" or "exceptions" in a pure functional subset of the language. Neither are RT.

Well, continuations aren't

Well, continuations aren't either. Dynamic scoping and exceptions can however be encoded via monads in a pure functional language, just like continuations. But I was making the observation because many people think that restartable exceptions are something special that needs exotic features like setjmp (see above) or continuations, whereas in fact they can easily be simulated on top of normal exceptions in an imperative language. Resumable/restartable is a bit of a misnomer: they don't restart the computation by rebuilding the stack to the point where the exception was raised, but rather they are callbacks that get invoked *before* stack unrolling has begun. The only trouble is then getting a callback from an exception handler to the exception throwing expression, hence the need for dynamic scoping (which itself can be built on top of state).

Continuations

In a pure, referentially transparent language with tail calls, you can easily pass continuations around - i.e. CPS by hand. What you can't do is capture the current continuation.

I agree that resumable exceptions are easy (just don't unwind, yet).
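
For instance, a minimal Haskell sketch of resuming without unwinding, by passing the handler explicitly (hypothetical names):

-- the failing operation consults a caller-supplied handler *before* any unwinding,
-- and simply continues with whatever value the handler supplies
divWith :: (String -> Int) -> Int -> Int -> Int
divWith onErr _ 0 = onErr "division by zero"
divWith _     x y = x `div` y

resumedWithZero :: Int
resumedWithZero = divWith (const 0) 1 0   -- the handler's policy: resume with 0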

It might be interesting to

It might be interesting to see how one can "re-think" these patterns for a dynamically typed language (which is what the topic starter is talking about, as I understand it) - e.g. a "real" sum type is not a possibility in a dynamic environment, a single expression may have a different type depending on the run-time values, etc.

?

Why would that be? Sum types are useful for type-checking, but have a direct dynamic semantics that works just as well in a dynamic language: wrap the value in a block tagged with the choice taken (`Left`/`Right` or something else), and then match over this tag to know later where the (now boxed) value came from. Other patterns in the list pointed to work just as well in languages with no type-checking (well, minus the usual benefits of static checking, which can for example ensure that you checked all error conditions).

Sum types in a dynamic

Sum types in a dynamic language == tags. A good example of such an approach is polymorphic variants. With variants you basically have constructors that are created "on the fly", and your whole program has just a single open algebraic type.

In a statically typed language, using a sum type is different than in a dynamic language:

last (x:[]) = Some x
last (_:xs) = last xs
last []     = None

Here the 'last' function returns a value of type Option. In a dynamic language one can write:

last (x:[]) = x
last (_:xs) = last xs
last []     = None

Here the return type depends on the value of an argument. The problem is that one can simply ignore the fact that "something bad has happened" and pass an erroneous value further.

With such an approach a sum type doesn't really differ much from a product type. Instead of tagging, one can simply return a special "error" object that would explode immediately once you try to do anything with it except match it using a specific "error check" pattern.

Unnecessary complication

In a dynamic language you *can* write the example as you demonstrated, but it is a bad idea in the general case. The first code snippet works just as well, and is much more robust, because it forces the receiver to acknowledge the necessary error-handling logic, and because it rules out a lot of not-fun-at-all issues with "what if the value of the last element is None" and friends.

There are error handling techniques, both in the typed and untyped world, that allow you to silently ignore the failure condition and continue with the value in any case (eg. Go's pattern of returning a product of a potentially meaningless value and an error-indicating boolean), but they have very different characteristics as error-handling systems. Certainly they might be a better choice in some situations, but it is not true in general that your second code "corresponds to" an untyped version of the first one; it is different code with very different properties. The direct untyped counterpart of the first code example is precisely the first code example.

You can deal statically with untagged, non-disjoint unions if you want to. In the ML world this is done through GADTs or some form of dependent types; if you want to use them in less structured ways, as is idiomatic in some untyped languages, you need sophisticated fine-grained static reasoning, such as the predicate typing of Typed Racket.

Yes, in a dynamic language

Yes, in a dynamic language you *can* write the code as I demonstrated and I believe such a possibility (obviously missing in a static language) is already a problem.

A "Go" way (as I understand) uses C-style error handling technique. In this case you can at least see (based on a function signature) that this function can fail and return an error code. In my "dynamic" example there is no way to see it. I believe it might be a problem. With a so called structured exception handling there is no way you can accidentally ignore an exception.

On the other hand, the ability to return a different type depending on run-time values gives more flexibility. Using sum types for error reporting might not always be convenient. We can look at the code sample from the topic starter:

x / 0

If we follow the "errors through sum types" paradigm, then a function like this can't just return a number; it will always return a sum type, which makes calculations involving division pretty cumbersome.

Not quite imperative

Lisp exception handlers are called in the context of whoever raises them, so they're functional in themselves. On the other hand, deciding which handler to call is done by checking a hidden dynamic variable, so that's not functional. On the gripping hand, the handler must decide whether to return via a captured continuation (non-functional) or attempt to return a corrected value to the caller (functional). So it's a baroque tangle of pure and impure semantics.

Returning a condition object plus semantics that say "By default, procedures passed a condition return their argument" is much simpler and clearer.

I like the simplicity of the

I like the simplicity of the latter. However, having to check each argument for a condition would really slow down execution.

(Well, perhaps this is not true... Could branch prediction negate most of this cost if the likely branch is made to be the absence of a condition argument?)

Assuming branch prediction is not going to save me, the nice thing about returning a corrected value to the caller is that much less checking needs to be done at run-time. Since my language is ultimately aimed at soft real-time simulation development, this might be preferable even if I have to take on some imperative semantics...

I think you mean "By

I think you mean "By default, procedures that receive a condition from a procedure they called return this condition". Indeed, this is basically what programmers manually do with return codes in C. Actually you can implement delimited continuations efficiently when compiling to languages that don't have delimited continuations with this strategy: instead of just returning what you received, you cons the local variables and the instruction pointer onto the value that you received and then you return it.

Exceptions are purely functional

The comment about exposing evaluation order vs. laziness is the key to this discussion. The argument is that if f(x) raises an exception, and you substitute f(x) for all instances of y in this procedure, then something else may get evaluated first. So the return value is now dependent on the order of evaluation that is embedded in the redex function of the interpreter. The same is true for assignments in impure languages.

But there is another way to look at exceptions. We can view all functions as returning an [open] union type. The "OK" result is the one that we declare as the return type. So a function either returns (OK, v:ReturnType), where v is some value, or it returns an exception instance. We then rewrite each return site to perform a suitable union dispatch and either propagate the exception or proceed forward in the computation.
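
A sketch of that rewrite in Haskell (the Result type and the names are illustrative):

data Result a = OK a | Exn String

divR :: Int -> Int -> Result Int
divR _ 0 = Exn "division by zero"
divR x y = OK (x `div` y)

-- each call site dispatches on the union: either propagate the exception or proceed
addOneToQuotient :: Int -> Int -> Result Int
addOneToQuotient x y =
  case divR x y of
    Exn e -> Exn e         -- propagate
    OK q  -> OK (q + 1)    -- proceed with the computation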

Once this rewrite is performed, both substitution and lazy evaluation operate as expected, and the language is once again pure.

So the real problem with exceptions is that they effectively force us to [re]write programs in a way that defeats lazy evaluation.

Lazy evaluation _is_

Lazy evaluation _is_ a rewrite order... So I would assume laziness to be orthogonal to whether a language can support exceptions.

(In essence, I would expect the difference between a language with strict evaluation and one with lazy evaluation, both with exceptions, to boil down only to when, where, and whether exceptions are thrown. It doesn't make it impossible for lazy languages to have exceptions. Referential transparency is the key, not laziness.)

You only break RT if you observe the exception value and change

control flow.

That said, I believe the OP is looking for A semantics for imprecise exceptions. I do not recommend skimming through this paper, but rather reading through it, as it covers some historical perspective on how Haskell exception handling developed, such as:

The idea of exceptions as values is very old [10, 18]. Subsequently it was realized that the exception type constructor, ExVal, forms a monad [6, 9]. Rather than having lots of ad hoc pattern matches on OK and Bad, standard monadic machinery, such as Haskell's do notation, can hide away much of the plumbing.

In the following section, they sketch very good design arguments for why simply using a union type may still break referential transparency, due to the fact the programmer now has to carefully and precisely manually manage sanity checks. In other words, although you are technically correct, in practice you may be wrong. ;-)

Well, uhm, I might not understand higher-order logic anymore..

But if the solution implemented in Haskell is the proposal from that paper, my own snailingly-slow compiler does it a bit better. Uhm, ;-P.

The thing is that it is easy to implement a CPS-like transform for exceptions which supplies an exception handler to all functions. That way, you don't have to take the exception value into account everywhere; i.e., you don't need to generate value/exception handling code in the body of every function. You just apply the exception handler, i.e., change the flow of control, wherever an exception occurs; a 'throw' gets replaced with the application of an abstract exception handler.

It's also quite reasonable to assume that a CPS-like transform should be implementable, since you're effectively dealing with flow of control changes. Like you stated.

A CPS-like transform like I did should be implementable in Haskell GHC too, and should be a bit faster. (But they probably already did so, the paper is probably dated; but I wouldn't know.)

This paper is for Control.Exception

This paper is for Control.Exception. I am honestly not that deeply familiar with Haskell, so maybe there are other approaches I don't know about.

Your observation about transforms for exceptions is correct, and it is the same logic that can be applied to resumable exceptions. In this way, there is no type paraphernalia required to ensure linearity across continuation calls to prevent resuming from the same processing context more than once.

A different approach to exceptions

If anyone is interested to know, there is a language that takes a different approach to the problem you are discussing. It's a purely functional, strict, statically typed language that doesn't have exceptions. Instead, it has a concept of computational success/failure. This means that any computation can either return a value/values (of the type(s) that is/are inferred statically) or fail (in which case the return value(s) are undefined). The language defines a special 'statement logic' that guarantees that, on one hand, the undefined values can never be used, and on the other hand, the failure of a computation doesn't cause the whole program to fail - kind of like exception handling, but in a more systematic manner. Does anyone care to know what I mean?

That sounds a lot like Aha!,

That sounds a lot like Aha!, which was recently discussed here.

You are right

You are right, I mean Aha!. But first I'd like to ask everyone a question: what IS an exception? Why do we have to include this concept in every modern language - because of 'fashion', or is there an actual need for them?
Second question: do you think that all 'exceptions' are the same, or is there a fundamental difference between, say, an I/O error and a division by zero (or another 'developer' exception)? In my opinion, there is a HUGE difference between them, and they should be treated differently. Especially in a functional language.

No Exceptions

Exceptions don't add much but complexity. Exceptions conflict or interact awkwardly with many programming models: parallelism, concurrency, laziness or promises, reactive programming, backtracking searches, language security, pipelining, generic programming, and modularity.

The alleged advantage of exceptions is to keep the "happy path" clean. It's a dubious claim in the presence of checked exceptions.

The symptom exceptions mask is a consequence of `if` conditionals not being directly composable or extensible. We could address the problem at its source. For composable condition structures, consider Multiple Returns or Arrow Choice.

Four years ago, I was studying exceptions quite intently. Today I advocate against building exceptions into the language. If developers model exceptions explicitly where they feel they need them, there is at least more flexibility in which tradeoffs are made.

do you think that all 'exceptions' are the same

No. I distinguish between errors that cause abstraction violation (such as "out of memory", in languages with implicit authority to allocate) vs. domain errors that can be handled (such as "file not found" or div-by-zero).

Disagree

Exceptions don't add much but complexity.

Disagree strongly.

Exceptions eliminate the need for very large quantities of rarely used (and never tested) code. The two motivating assumptions are:

  1. If an error cannot be locally handled at the point where it arises, it is very unlikely to be productively handled at any point that is more removed from the locus of occurrence. In consequence, most exceptions are de practico fatal.
  2. Given that most exceptions are uncatchable [when not handled locally], there is no point introducing an explosive amount of code around return conditions for the sole purpose of propagating unrecoverable outcomes back to the outermost contour of the application.

If this view is taken, then exceptions provide an important and useful syntactic shorthand for error propagation.

The main assumption here is [1]. Assumption [2] is actually consequential damage. In my experience, there are very few design patterns in which [1] does not apply, but (forgive me) they are exceptional cases. The main one is "I'm going to run this in an exploratory/optimistic way, and if that doesn't work I'm going to do something more complicated."

It is possible that I am biased by selective experience here. I've spent 20 years building high-confidence code, in which obsessive attention was given to exposing and testing assumptions and preconditions. In such code, the nearly ubiquitous pattern of the code is:

someproc(...) {
   if (!testRequirement1) failHorribly;
   if (!testRequirement2) failHorribly;
   // all preconditions checked
   // proceed with steps that are guaranteed to succeed
   // by virtue of preconditions being checked.
}

This pattern meshes with exceptions very nicely. Unfortunately it is rarely encountered in code that is designed to conventional standards of robustness.

Of course, David may hold that it's a good thing for woodpeckers to destroy civilization on a recurring basis so that we can rebuild. If that is his unstated assumption then I'll withdraw my objection. I'm sure I have some woodpecker repellent around here someplace.:-)

I distinguish between errors that cause abstraction violation (such as "out of memory", in languages with implicit authority to allocate) vs. domain errors that can be handled (such as "file not found" or div-by-zero).

I think this is an interesting mis-characterization. There is a difference between out-of-memory and divide-by-zero (on the one hand) and file-not-found (on the other). The difference is that they violate abstractions at different layers of the abstraction hierarchy. But both violate abstractions.

David and I seem to have a difference of opinion about which layer divide-by-zero lives in. I can make a case for his position, but my sense is that div0 is usually best viewed as an error that arises at the hardware instruction execution layer of abstraction, which is the same layer of abstraction where OOM errors conceptually occur. Reasonable people could disagree about where div0 lives in the hierarchy, so let's not get too hung up on that. The main point I'm trying to make is that abstractions exist in a hierarchy, that violations of abstractions can exist at multiple levels, and that exceptions are a not-bad mechanism for signaling this without regard to what level of abstraction was violated.

Exceptions are not (usually) syntactic shorthand

there is no point introducing an explosive amount of code around return conditions for the sole purpose of propagating unrecoverable outcomes back to the outermost contour of the application

We can explicitly model propagation of error without having it clutter our code around return expressions. We can do so without exceptions. In Haskell, I might use an ErrorT monad or ArrowError to represent propagation of generic failure.
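
For example, a sketch in that style using Except (the successor to ErrorT in current mtl/transformers); the names are made up:

import Control.Monad.Except (Except, runExcept, throwError)

divE :: Int -> Int -> Except String Int
divE _ 0 = throwError "division by zero"
divE x y = return (x `div` y)

-- failures propagate to the caller without cluttering each return expression
pipeline :: Int -> Int -> Except String Int
pipeline x y = do
  q <- divE x y
  r <- divE 100 q
  return (q + r)

-- runExcept (pipeline 10 2) == Right 25
-- runExcept (pipeline 10 0) == Left "division by zero"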

If an error cannot be locally handled at the point where it arises, it is very unlikely to be productively handled at any point that is more removed from the locus of occurrence.

We can productively isolate errors on various boundaries, switch to fallback behaviors. But I agree that the precision with which we can address an exception reduces with distance from the locus of error. (This makes a fine argument against exceptions, IMO.)

In the opposite direction, we can usefully parameterize some computations or modules with advice on how to handle certain errors, i.e. when there are multiple distasteful options. This still allows for local handling of the exception, but affords non-local policy. Resumable exceptions are able to express these policies, which makes them marginally useful compared to normal try/catch mechanisms.

But I favor modeling the error-advisors explicitly, as a capability. And, in interest of avoiding clutter, I might use a ReaderT monad or StaticArrow or like abstraction.
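
A minimal sketch of the advisor-as-capability idea, using the plain Reader monad (the Advisor policy type and names are hypothetical):

import Control.Monad.Reader (Reader, asks, runReader)

-- the "advisor" is an explicit capability: a policy for what to resume with on failure
type Advisor = String -> Int

divAdvised :: Int -> Int -> Reader Advisor Int
divAdvised _ 0 = asks ($ "division by zero")   -- consult the advisor locally, apply its policy
divAdvised x y = return (x `div` y)

-- runReader (divAdvised 1 0) (const 0) == 0   -- non-local policy: resume with 0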

exceptions provide an important and useful syntactic shorthand for error propagation

I don't mind supporting syntactic sugar (even user-defined) for various error handling patterns. But exceptions are typically more than syntactic shorthand, with deep implications for types, concurrency, distribution, modularity, security, etc..

But both violate abstractions. [...] violations of abstractions can exist at multiple levels, and exceptions are a not-bad mechanism for signaling this

An alternative to supporting "abstraction violations" is to have developers fix their abstractions. If an abstraction explicitly includes potential for failure, then reporting said failure is not a violation of abstraction.

my sense is that div0 is usually best viewed as an error that arises at the hardware instruction execution layer of abstraction

How do you justify your intuition in this case?

Agree to disagree

I don't really have the time to debate this, but a couple of short points:

1. When Haskell takes over the world, you can use an ErrorT monad. In the meantime we have exceptions.

2. We agree about distance from locus of error. Once an exception escapes the perimeter in which handling is possible, it still serves one useful purpose: error reporting. That purpose typically isn't served by most of the alternatives you describe. I imagine they can be extended, but I haven't seen it done.

3. Regarding div0: it's a hardware exception that arises solely as a consequence of executing a particular instruction. I'm only trying to say that the corresponding software exception originates deep in the language runtime, and therefore should be viewed as an exception arising from a very low level of [software] abstraction.

div 0

Disagree. For most programming OutOfMemory is a violation of an abstraction that says "we have infinite memory." And in many existing concurrent environments it's impossible to reliably test for sufficient memory prior to allocating.

Pre-allocating is the usual trick to prevent OOM, but that still means that OOM can happen during startup unless the entire environment is very controlled as in many embedded systems.

DivBy0, on the other hand, is just a point where the function isn't defined and the type system is too stupid to know it. In most systems it's always possible to test for 0 before doing the division to avoid any hardware involvement.

Even when it gets to hardware, DivBy0 may or may not be trapped by hardware. No law of nature says it must be. It could return a special NaN value instead, say. IIRC Intel lets you disable DIV_BY_ZERO trap on floating point. And the JVM (which can and has been implemented in hardware) defines floating point DivBy0 as NaN instead of an exception.

footnotes

Habit prevents division by 0 in the type system, by restricting the divisor to a nonzero-integer type. Depending on the circumstances, no actual runtime checks may be needed.

Hardware varies indeed in the handling of division by 0. Integer division by zero on PowerPC gives an undefined result, but never traps; on x86, it always traps. Many architectures do not even have (integer) division in hardware. For floating-point division, it is always a matter of configuration, as per IEEE 754.

Division by zero definitely belongs to a "programmer-avoidable" class of errors, in contrast to "unavoidable" ones (TCP connection reset, printer on fire). Out-of-memory comes somewhere in-between.

Exceptions for Control Flow

There has been heavy discussion of exceptions as a tool to handle failures, but what about exceptions as a purely control flow construction? As an OCaml programmer I occasionally rely on exceptions as a purely static control flow device, jumping out of a computation to an explicit catcher in the near context, possibly carrying some value.
(In fact, some intermediate compiler language of OCaml has an explicit "static try/raise" construct that is used for the compilation of pattern matching.)

As a mild form of side-effects, this has well-known drawbacks, but it has some "good sides" in my opinion. For example, we know how to reasonably efficiently compile exceptions in a rather simple way, while supporting monadic-style failure handling in an efficient way requires compiler support for optimizations (it tends to create lots of use-once anonymous functions and would therefore naively result in undue amounts of closure allocation).

Hacking the language

It would be better if you did not need to use exceptions to efficiently express control flow. The multi-return models I mentioned earlier, and CPS with tail calls, each offer effective alternatives. But if exceptions are the most convenient, efficient mechanism, then use them. My advice of `No Exceptions` is aimed more at new languages.

supporting monadic-style failure handling in an efficient way requires compiler support for optimizations

Rather than depending on a smart compiler, you can model many of those optimizations explicitly by clever use of GADTs for staged computation. Even more so for Arrows or Applicatives, which have a more rigid structure.

The multi-return models I

The multi-return models I mentioned earlier, and CPS with tail calls, each offer effective alternatives.

Re. "CPS with tail calls", if you are thinking of the programmer explicitly using CPS style, this is essentially a way to go back to monadic programming style (you'll essentially use a form of continuation monad instead of a form of Maybe monad, handling things by the control end rather than the value end), with the same issues. If you are thinking of a feature to use continuations in direct style (eg. call/cc), this is even stronger than dynamic try/raise.

Now, it is possible to use a "partial CPS" by passing a continuation for the error case, but still using direct style for the normal return path. Compared with direct style exceptions, it still has the defect of needing adaptation of code to pass this error continuation around (eg. you can't do that with the obvious `for` loop, and library-provided iterators need to provide a facility for this as well).

Rather than depending on a smart compiler

Yes, that's my problem. I'm still not sold on the question of whether compilers should go out of their way to optimize code. I would prefer to concentrate on usable abstractions that have an already-efficient semantics.

No need for continuation

No need for continuation monad in this case (no need for general ability to access the current continuation as a value). One might use a monad, however, to pass the current error handling case around to avoid clutter on calls. This could be built atop some form of ReaderT.

I would prefer to concentrate on usable abstractions that have an already-efficient semantics.

I guess your issue is that stack-based languages ("stacks" generally modeling exactly one continuation with moderate efficiency and good cache-locality) don't offer similarly efficient heap allocation, so managing multiple return paths can become inefficient.

There are simple alternatives to stack-based implementations, however, that do offer an efficient semantics without any smart optimizations. I've had great success with a few language implementations that used a generational bump-pointer nursery per lightweight thread, with separate spaces for large allocations and memory communicated to other threads. One can get very near stack-level performance, even for ad-hoc multi-return cases, if we keep nurseries within the CPU cache.

I'm still not sold on the question of whether compilers should go out of their way to optimize code.

I did not attempt to sell you on a smart compiler that goes out of its way to optimize code. Staged programming with GADTs would allow developers to write their own dumb compiler directly in code.

Is this really the case? If

Is this really the case? If you implement a monad with delimited continuations then you get basically the same efficiency as native exceptions. In particular no closures are allocated like in explicit monadic style. This is of course assuming that the delimited continuations are implemented efficiently (i.e. not by CPS transform, but by stack walking).

Well, as soon as your main

Well, as soon as your main program sequencing tool is a form of `bind : a m -> (a -> b m) -> b m`, you quite naturally get lots of closures popping up. Are you thinking of a different way to write monadic programs?

Yes, like in Representing

Yes, like in Representing Monads. If I understand it correctly, you only have to do a bind at the point where you are actually inserting a monadic value into a computation, instead of at every point of the program. For example for exceptions, you only do a single bind at the point where you throw an exception, so when you don't actually throw there is zero overhead.

But then you have

But then you have "reset/shift" primitives (or any other presentation of delimited continuations) in your direct style program, isn't it? In that case, you haven't gained much over using try/raise, as you've introduced even more powerful control operators.

Yes

Yes. Isn't not having to rewrite your program in a contorted style a good thing? Is there an advantage to inversion of control?

"Would the replacement of

"Would the replacement of violations in my language with exceptions technically render it 'impure'?"

My take: it all boils down to semantics. What about "f x = 0" and the term "f (throw MyException)"? Given strict evaluation, that term would reduce to a "MyException" exception being thrown; given lazy reduction, the term would reduce to "0".

Are either of these results, "MyException" or "0", pure or impure? Well, are there side effects? No. Can you substitute values like one would expect in math? Well, no, not really. Is the result symbolically correct, i.e. formally and therefore also 'purely' correct given a logic and evaluation order? Sure.

It really boils down to what you think constitutes purity (IO, reduction order, referential transparency). So, it's a matter of definition. But according to a broad enough definition of purity, even C programs are pure, so I agree I would call exception handling in a language a 'somewhat impure' feature.

(And at the same time, I would call your first NaN-like solution, replacement of a value, 'pure'.)