Lambda the Ultimate

CLR Exception Model
started 10/2/2003; 12:18:04 PM - last post 10/6/2003; 9:14:29 AM
Ehud Lamm - CLR Exception Model
10/2/2003; 12:18:04 PM (reads: 14500, responses: 11)
CLR Exception Model
Chris Brumme has more than you ever wanted to know about this subject...

Well, if you share my tastes you are going to find the discussion of managed exceptions interesting and have a lot to say about the Performance and Trends section, but find the explanation of Windows Structured Exception Handling (SEH) way too detailed. But then, if you are working on implementing a compiler for this platform at the moment, you'll find the details indispensable.

I guess Chris has a point when he says that if you actually want to know more, perhaps you should just apply for a job on the CLR team...


Posted to cross-language-runtimes by Ehud Lamm on 10/2/03; 12:20:26 PM

Ehud Lamm - Re: CLR Exception Model
10/2/2003; 12:28:23 PM (reads: 899, responses: 0)
Here's what Chris has to say about checked/unchecked exceptions:

The above text has described a pretty complete managed exception model. But there’s one feature that’s conspicuously absent. There’s no way for an API to document the legal set of exceptions that can escape from it. Some languages, like C++, support this feature. Other languages, like Java, mandate it. Of course, you could attach Custom Attributes to your methods to indicate the anticipated exceptions, but the CLR would not enforce this. It would be an opt-in discipline that would be of dubious value without global buy-in and guaranteed enforcement.

This is another of those religious language debates. I don’t want to rehash all the reasons for and against documenting thrown exceptions. I personally don’t believe the discipline is worth it, but I don’t expect to change the minds of any proponents. It doesn’t matter.

What does matter is that disciplines like this must be applied universally to have any value. So we either need to dictate that everyone follow the discipline or we must so weaken it that it is worthless even for proponents of it. And since one of our goals is high productivity, we aren’t going to inflict a discipline on people who don’t believe in it – particularly when that discipline is of debatable value. (It is debatable in the literal sense, since there are many people on both sides of the argument).

Ehud Lamm - Re: CLR Exception Model
10/3/2003; 1:27:42 AM (reads: 820, responses: 0)
For old timers who don't know what this finally business is all about here are a couple of explanations, one using Java and one using C#.
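As a minimal sketch of what finally does (the class and method names are made up for illustration and are not taken from either explanation): the finally block runs whether the try body completes normally or throws.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class FinallyDemo {
    // Records the order in which the blocks run.
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        try {
            try {
                log.add("try");
                if (fail) {
                    throw new IllegalStateException("boom");
                }
            } finally {
                // Runs on both the normal and the exceptional path.
                log.add("finally");
            }
        } catch (IllegalStateException e) {
            // Reached only after the inner finally block has run.
            log.add("caught");
        }
        return log;
    }
}
```

Calling run(false) records "try" then "finally"; run(true) records "try", "finally", "caught", since cleanup happens before the handler further out sees the exception.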

Tim Sweeney - Re: CLR Exception Model
10/4/2003; 2:33:45 PM (reads: 682, responses: 2)
Ehud,

Would you consider it uncontroversial for a language in the general spirit of C++/C#/Java to expose exceptions as follows?

- There are two kinds of functions, clearly distinguishable by syntax: imperative functions (which may have side effects and other referentially non-transparent behaviour) and pure functions (referentially transparent).

- Pure functions default to being able to throw no exceptions. To widen the set of exceptions throwable by a pure function, you need to add an extra keyword, like "throws(int,string)", or (in the most general case) "throws(any)".

- Imperative functions default to being able to throw exceptions of all types. To narrow the set of exceptions throwable by an imperative function, you need to add an extra keyword, such as "throws(int,string)" or (in the most narrow case) "throws(none)".

- A type of functions f may only be a subtype of a type of functions g if f throws a subset of the exceptions of g.

My thinking is that this approach satisfies the largest possible audience, giving you the capability of expressing any possible exception widening/narrowing behaviour, but defaulting to what is typically wanted: pure functions where exception support is narrow and may only be widened explicitly, and imperative functions where exception support is universal and may only be narrowed explicitly.
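Java's checked exceptions already follow the subset rule from the last bullet for method overriding, which gives a rough feel for it. The sketch below uses hypothetical types (Source, FileSource, ConstantSource); Java has no pure/imperative split, so only the throws part of the proposal is illustrated.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical types; only the throws clauses matter here.
interface Source {
    String read() throws IOException;
}

// Allowed: FileNotFoundException is a subclass of IOException,
// so this override throws a subset of what Source.read may throw.
class FileSource implements Source {
    @Override
    public String read() throws FileNotFoundException {
        throw new FileNotFoundException("no such file");
    }
}

// Allowed: an empty throws clause is the narrowest possible subset.
class ConstantSource implements Source {
    @Override
    public String read() {
        return "constant";
    }
}

// Not allowed (compile error if uncommented): widening the set.
// class BadSource implements Source {
//     public String read() throws Exception { return ""; }
// }
```

A caller holding a ConstantSource needs no handler at all, while a caller holding a Source must still deal with IOException.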

Thoughts?

Daniel Yokomiso - Re: CLR Exception Model
10/4/2003; 4:30:33 PM (reads: 680, responses: 1)
One thing that would make a checked exception handling system less controversial:

Add some form of parametric covariance, so a library writer can encapsulate the exceptions in their code without cluttering the client (pseudo-Java example):

package mypackage;
class MyException<E> {
    E getCause();
}
class MyApi {
    int someMethod(String foo) throws(MyException<IOException>, MyException<SQLException>) {
        try {
             return baz(bar(foo));
        } catch(SQLException exc) {
             throw new MyException<SQLException>(exc);
        } catch(IOException exc) {
             throw new MyException<IOException>(exc);
        }
    }
}

// client
try {
    new MyApi().someMethod("Hello World!");
} catch (MyException<SQLException> exc) {
    System.out.println("Database error:");
    exc.printStackTrace();
} catch (MyException<Exception> exc) {
    exc.printStackTrace();
}

In this way you can let the library writer pack the exceptions inside a type-safe container and the catch-clauses let you specify which kind of causing exception you want to deal with. Either that or named union types, so he can write:


package mypackage;
union MyException = SQLException | IOException;
class MyApi {
    int someMethod(String foo) throws(MyException) {
         return baz(bar(foo));
    }
}

// client
try {
    new MyApi().someMethod("Hello World!");
} catch (SQLException exc) {
    System.out.println("Database error:");
    exc.printStackTrace();
} catch (MyException exc) {
    exc.printStackTrace();
}

The union code is nicer, IMHO, but some people may prefer the other one. The rest of the idea is fine.
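For comparison, later versions of Java (7 and up) allow an anonymous union at the catch site through multi-catch, though there is still no way to name the union or use it in a throws clause. A minimal sketch, with MyApi reduced to a stand-in whose failure conditions are invented for the example:

```java
import java.io.IOException;
import java.sql.SQLException;

// Hypothetical API; bar/baz from the example above are collapsed into one method.
class MyApi {
    int someMethod(String foo) throws IOException, SQLException {
        if (foo.isEmpty()) {
            throw new IOException("empty input");
        }
        if (foo.startsWith("sql:")) {
            throw new SQLException("bad query");
        }
        return foo.length();
    }
}

class Client {
    // Separate handlers, as in the client code above.
    static String call(String input) {
        try {
            return "ok:" + new MyApi().someMethod(input);
        } catch (SQLException exc) {
            return "Database error";
        } catch (IOException exc) {
            return "I/O error";
        }
    }

    // One handler for the anonymous union SQLException | IOException.
    static String callUnion(String input) {
        try {
            return "ok:" + new MyApi().someMethod(input);
        } catch (SQLException | IOException exc) {
            return "failed: " + exc.getMessage();
        }
    }
}
```

The throws clause still has to list both exception types, which is exactly the clutter the named union is meant to remove.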

On the pure/imperative distinction I have a question. Will you make a distinction between local side-effects (i.e. like the State monad) and global side-effects (i.e. the IO monad)? Because pure functions can be implemented using local side-effects, so it should be ok to use local imperative code (e.g. implementing Fibonacci using while and updatable variables) but not ok using global imperative code (e.g. accessing a Fibonacci file). If you forbid both kinds of usage in pure functions you'll start to duplicate your libraries (e.g. having both pure and imperative versions of hashmaps, arrays, etc.) so people can use purely functional binary trees inside their functions. IME as a language designer it's better to distinguish between pure, local-imperative and global-imperative than having a pure/impure dichotomy.
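A small sketch of the local-imperative case (LocalState is a hypothetical name; Java itself does not check purity, so the claim is only about observable behaviour): the function mutates its own locals, but callers always get the same result for the same argument and can see no side effect.

```java
// Hypothetical example: observably pure, implemented with local mutation.
class LocalState {
    static long fib(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        long a = 0; // F(i)
        long b = 1; // F(i + 1)
        int i = 0;
        // The updatable variables never escape this stack frame.
        while (i < n) {
            long next = a + b;
            a = b;
            b = next;
            i++;
        }
        return a;
    }
}
```

Reading the numbers from a file instead would be the global-imperative case: the result would then depend on state outside the function.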

Tim Sweeney - Re: CLR Exception Model
10/4/2003; 10:07:40 PM (reads: 659, responses: 2)
Daniel,

Totally agreed. Type unions especially are an extremely valuable language feature, and aren't too hard to implement. It's amazing how frequently C++/Java/C# programmers create a subclass hierarchy when what they really want to define is a disjoint union, or a type union.

Regarding the distinction between imperative functions and pure functions: programming with local exceptions and local heaps can be done inside a pure function, and the typechecker can assure that no imperative effects can escape out of that pure function. The Haskell State monad provides a good example of how this can be implemented (though there are other ways, and I use an approach that looks more like Pascal/Java/C# syntactically).

The exception handling details are straightforward (i.e. a pure function must be statically provable to catch all possible exceptions its body is capable of throwing before it exits).

The heap details are more complex, and sometimes end up limiting what the compiler can recognize as being locally imperative within a pure function. The most general solution is region typing, but its syntactic overhead is big, so right now I feel OK about accepting the pure/imperative dichotomy and the limitations that go with it.

In general, the set of exceptions throwable by a function, and its heap accessibility (pure or imperative) are things one wants to parameterize over.

For example, you want to be able to write a 'map' function that is pure when its input function is pure, but imperative when its input function is imperative.

This has a nice solution if you embrace the most general subtyping relationship possible (f:a->b <: g:c->d iff b<:d and c<:a and f's set of effects (heap accessibility, set of throwable exceptions) is a subset of g's set of effects). In addition to being able to declare exceptions as "throws(int|string)", you can say "throws(int|f.exceptions)" to say that you can throw integers or any exceptions throwable by f.
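Java generics can approximate the exception half of this with a type parameter in the throws clause. The sketch below is only an approximation under stated assumptions: ThrowingFn and Effects are hypothetical names, a single parameter E stands in for "f.exceptions", and heap effects are not tracked at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: a function whose checked exceptions are the parameter E.
interface ThrowingFn<A, B, E extends Exception> {
    B apply(A a) throws E;
}

class Effects {
    // map may throw exactly what f may throw, roughly "throws(f.exceptions)".
    static <A, B, E extends Exception> List<B> map(List<A> xs, ThrowingFn<A, B, E> f) throws E {
        List<B> out = new ArrayList<>();
        for (A x : xs) {
            out.add(f.apply(x));
        }
        return out;
    }
}
```

When the lambda passed to map throws no checked exception, E is inferred as RuntimeException and the caller needs no handler; when it throws IOException, the call to map must handle or declare IOException. A union such as "int|f.exceptions" has no direct equivalent here.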

Do you think that capability would solve the problems you've pointed out with the pure/impure dichotomy? I haven't run into those myself (yet), but in my work so far I've had somewhat simplistic boundaries between functional and imperative code.

Daniel Yokomiso - Re: CLR Exception Model
10/5/2003; 6:43:12 AM (reads: 644, responses: 0)
The heap details are more complex, and sometimes end up limiting what the compiler can recognize as being locally imperative within a pure function. The most general solution is region typing, but its syntactic overhead is big, so right now I feel OK about accepting the pure/imperative dichotomy and the limitations that go with it.
Local heaps are a nice idea. If we can type the heap usage of some operation it can be used to guarantee real-time constraints. I think a way to provide local imperative operations is to map them to two-way monads, so we can see an Algol-like syntax as syntactic sugar. That way you can let the programmers use different kinds of two-way monads, not just state-like ones.
In general, the set of exceptions throwable by a function, and its heap accessibility (pure or imperative) are things one wants to parameterize over. For example, you want to be able to write a 'map' function that is pure when its input function is pure, but imperative when its input function is imperative.
Hmm, in Haskell I would write "mapM action list" (that is, "sequence (map action list)"), but we could make "map" a type class member so we can instance it for "a -> b" and "a -> m b" and let it do the right thing (including sequencing it), so I don't see how this could be different, except perhaps for the syntactic sugar.
In addition to being able to declare exceptions as "throws(int|string)", you can say "throws(int|f.exceptions)" to say that you can throw integers or any exceptions throwable by f.
IIRC in Needle you could write "f : (a -> b throws c) -> b throws(int|c)" or something like it, so I guess you don't need an "exceptions" keyword. I'm using a notation very similar to this one, so I guess we're on the right track ;)
Do you think that capability would solve the problems you've pointed out with the pure/impure dichotomy? I haven't run into those myself (yet), but in my work so far I've had somewhat simplistic boundaries between functional and imperative code.
I don't know, it appears to be so, but I think some investigation is necessary. For example you can define data structures with imperative or functional operations. If the programmers can write an imperative set that is safe to be used by pure functions you won't have problems. But if your system does not allow it they'll have to write two sets, one imperative and one functional, with different time and space behavior. If you succeed in designing such a system it'll be very powerful.

Neel Krishnaswami - Re: CLR Exception Model
10/5/2003; 10:58:31 AM (reads: 619, responses: 1)
No, Needle never had exception declarations. This is because I think you need exception variables to get sufficient precision in the exception typing -- you need to be able to write something like "foo : (Int -> Int throws a, Int) -> Int throws a" so that the information about what the argument can throw doesn't get lost. However, you also need to handle general set constraints because of the way exception handlers work:

fun foo(f : Int -> Int throws a, x : Int) {
  try {
    f(x)
  } catch (DivideByZero) {
    0
  }
}

Now you'd want to say foo has type "foo : (Int -> Int throws a, Int) -> Int throws a - {DivideByZero}".

This is nasty because the set difference operation means you don't have any kind of principal typing, and that means that your compiler is going to infer these huge, ugly, unreadable sets for the throws declaration.
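Java has no set difference on throws clauses either. One way to sidestep it (an assumption of this sketch, not something proposed in the thread) is to move the handled exception into the argument's type as a union, so the result only needs the plain variable. DivideByZero, IntFn and Handlers are hypothetical names.

```java
// Hypothetical checked exception standing in for DivideByZero above.
class DivideByZero extends Exception {}

// f may throw DivideByZero in addition to whatever E stands for.
interface IntFn<E extends Exception> {
    int apply(int x) throws E, DivideByZero;
}

class Handlers {
    // foo handles DivideByZero itself, so only E can escape.
    static <E extends Exception> int foo(IntFn<E> f, int x) throws E {
        try {
            return f.apply(x);
        } catch (DivideByZero e) {
            return 0;
        }
    }
}
```

This reads as "foo : (Int -> Int throws a + {DivideByZero}, Int) -> Int throws a". It avoids the difference operator only because the handled exception is fixed in the interface; it does not help with inferring such types, which is the hard part described above.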

Daniel Yokomiso - Re: CLR Exception Model
10/6/2003; 5:14:40 AM (reads: 562, responses: 0)
No, Needle never had exception declarations. This is because I think you need exception variables to get sufficient precision in the exception typing -- you need to be able to write something like "foo : (Int -> Int throws a, Int) -> Int throws a" so that the information about what the argument can throw doesn't get lost.
I seem to remember something like:

o : (b -> c throws x) -> (a -> b throws y) -> (a -> c throws x + y)
in a post of yours. Anyway how is Needle going? The mailing list appears to be dead (last message in July, before that in February).
Now you'd want to say foo has type "foo : (Int -> Int throws a, Int) -> Int throws a - {DivideByZero}".

This is nasty because the set difference operation means you don't have any kind of principal typing, and that means that your compiler is going to infer these huge, ugly, unreadable sets for the throws declaration.

How does this ("huge, ugly, unreadable sets") weigh against the benefits of precise exception typing? Isn't it similar to the debate on templates in C++: they're useful, but the compiler-inferred instances are unreadable?

Neel Krishnaswami - Re: CLR Exception Model
10/6/2003; 9:14:29 AM (reads: 520, responses: 0)
Large exception type expressions aren't intrinsically bad, if they are conveying large amounts of information. What's bad is when there are lots and lots of huge, equivalent expressions, because then the programmer can't easily identify equal sets of exceptions. That's why principal typing a la ML is such a big deal -- it gives you a canonical form for each type, so that the programmer can easily figure out which expressions have the same type.

The lack of such a canonical form is a great big warning sign to the type system designer. It's not a fatal problem, since you can add type declarations (which will presumably be readable!) and check that the inferred type matches the declaration, but it does mean that you've moved up the ladder of complexity, and that perhaps you should rethink the approach.

Kory Markevich - Re: CLR Exception Model
10/7/2003; 10:17:08 AM (reads: 463, responses: 0)
While I agree that the causing exception should be preserved, what does making its type explicit gain? From what I can tell it would break encapsulation, by making an internal detail of the method a part of the method's type. Thus changes to the method could require updating of the throws clause or the union type. Further it wouldn't solve any of the problems that most people encounter when trying to use checked exceptions as if they were unchecked exceptions, such as the exception explosion when they are allowed to propagate incorrectly.

With your parametric example, after a few methods you'd end up with huge, unwieldy types like

ItDidntWorkException< BoomException< YouReallyDontCareAboutMeException< TheSourceOfAllEvilException > > >

And the union choice would be even worse, as the type would reduce to the recursive union of all exceptions that may be thrown under the method. Use of a union type would just move the list from the throws clause to another location, which does not deal with the problem itself. It is also a step back in that it throws out the chain of exceptions in favour of just propagating the original.

On the other hand I don't see any such problems with the convert and chain idiom. The original causing exception is preserved and can be inspected if desired. Encapsulation is preserved by keeping implementation details from polluting the type interface. The exception explosion is eliminated.
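For readers who haven't seen it, a minimal Java sketch of convert and chain (StorageException and Storage are hypothetical names, and readRaw is a stand-in that always fails so the example is self-contained): the low-level exception is caught at the API boundary and wrapped as the cause of a single library-level exception.

```java
import java.io.IOException;

// Hypothetical library exception; the cause is kept as a value, not in the type.
class StorageException extends Exception {
    StorageException(String message, Throwable cause) {
        super(message, cause);
    }
}

class Storage {
    // Stand-in for an implementation detail that does I/O and fails.
    private static String readRaw(String key) throws IOException {
        throw new IOException("disk unavailable for " + key);
    }

    // Callers see only StorageException, whatever the implementation uses.
    static String load(String key) throws StorageException {
        try {
            return readRaw(key);
        } catch (IOException e) {
            // Convert to the API's exception and chain the original as the cause.
            throw new StorageException("could not load " + key, e);
        }
    }
}
```

A caller that cares can still inspect getCause(), but switching the implementation from files to a database would not change the signature of load.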

Am I missing something?

andrew cooke - Re: CLR Exception Model
10/17/2003; 1:19:14 PM (reads: 340, responses: 0)
It's amazing how frequently C++/Java/C# programmers create a subclass hierarchy when what they really want to define is a disjoint union, or a type union.

what should you do instead? isn't this the appropriate way to scratch that itch in those languages?