unchecked exceptions are like runtime type checking?

I've been having a running discussion with a coworker of mine. He's a big fan of Ruby, while I find its runtime type checking worrisome. His recent response to my concerns about runtime type checking is that in the Java arena people are pushing using unchecked (runtime) exceptions, and that unchecked exceptions are just as unsafe as runtime type checking. In a way I can see his argument, because in Ruby if you misspell a method name it will throw a method not found error, so you could look at Ruby's runtime type checking as simply a source of unchecked exceptions.

But I have this sense that it's not quite that simple and can't put my finger on why.

Runtime... Compile Time... Who Cares?

Runtime checks are no worse than compile time checks as far as I'm concerned, and in a language with no specific compile time, the argument is moot. What advantage is there to type-check at compile time rather than runtime? The error is caught either way. And anyway, what is unit testing for (something that is, interestingly, more popular in the Ruby community than programmers of other languages) if not to catch weird things like this anyway? As long as objects respond to the messages they need to respond to, they will work, even if they are not the "correct" type.

By Jake M at Fri, 2006-08-25 05:36 | login or register to post comments

Given a Two-Stage Language System...

... the difference is that type-checking at compile time results in errors being caught by the programmer rather than the user. There are other possible benefits, e.g. code optimization, but the big one is that your user never sees whole classes of errors.

In the currently popular statically-typed languages, this benefit is mostly trivial (IMHO), but one of the things that makes the MLs, Haskells, Epigrams, etc. of the world interesting is precisely their ability to express much richer constraints on program behavior at compile time than other languages.

But note my subject line: "Given a two-stage language system..." where things really get interesting is with languages that expose more than two stages, i.e. where some level of partial evaluation and/or runtime code generation is available in some principled way. Many of the questions swirling around dynamic typing and static typing become vastly more interesting given a language such as MetaOCaml.

Update: I neglected to mention another important fact: type systems (which I take to be compile-time by definition, cf. TAPL) prove the absence of (undesirable) properties of the code, whereas tests can only probabilistically demonstrate the presence of (undesirable) properties of the code. Again, how significant this is is highly dependent upon your application domain, as well as the power of your tools. Much of the ranting that I engage in on LtU boils down to reminding people that even languages with type systems that have been around for some 25 years now, such as Standard ML and O'Caml, can actually express interesting constraints at compile time—much more so than is the case in C, C++, Java, C#...

By Paul Snively at Fri, 2006-08-25 13:21 | login or register to post comments

I'd like to see a unit test

I'd like to see a unit test that effectively demonstrates a lack of side-effects up to a specific abstraction. Or at least, one that it's actually practical to use.

One project that I'll probably never get round to coding up is something akin to a lisp system, but based around a static type system (I'm tempted by System Fc as a starting point currently). Playing around with staging is one of the things I'd be doing with it.

By Philippa Cowderoy at Fri, 2006-08-25 13:52 | login or register to post comments

Behavior vs. Implementation

Unit tests are not meant to prove lack of side effects though. The only thing I use unit testing for is to prove that objects respond to the correct methods. When it is important that it responds in a particular way to a particular method, I then check the methods of that method's return value(s), and so forth until it no longer matters how the return values' methods are implemented. I do not bother checking things like type since it actually does not matter. It's the behavior, not the implementation, that is important in OO design. I argue with this point only because Ruby is a pure object-oriented language. In an impure or non-object-oriented language, unit testing is, in my opinion, useless, and things such as compile-time type checking become much more useful.

By Jake M at Fri, 2006-08-25 18:14 | login or register to post comments

You're repeating the same

You're repeating the same old tired mistakes. Types can express exactly the things you're checking.

By Philippa Cowderoy at Fri, 2006-08-25 18:57 | login or register to post comments

The main mistake you're

The main mistake you're making here I think is that unit tests are not a direct substitute for type checking. Type checking occurs on code paths there are no tests for. It is very possible in a language like Python or Ruby that a typing error only results when a very specific combination of things occur. In C++, you would still get a type error, because type checking is done everywhere and at compile time. In Python or Ruby you may not know about it for a long time.

But in Jake's defense -- I mostly made type errors as a newbie programmer. I just got done with a 10,000 line C++ project and had far more logic and design errors than type ones. The only type error that's every really nipped me in the butt in Python was accidentally assigning to a function instead of a function call. This mistake is very easy to make:

x = f # whoops
x = f() # what I meant

This would be easily fixable by making assigning to functions require a special syntax, something like:

x = $f

With the $ I'm specifically indicating my intent to assign to the function rather than a call.

And again to Jake's defense, even though it's true that you can't "prove" the program is without type errors without unit tests, outside the ivory tower and absolutely mission critical software, no one cares. Most software isn't navigating the Mars rover. For most programming situations sacrificing provability of correctness for development speed is a tradeoff worth considering.

By dataangel at Sun, 2006-08-27 05:14 | login or register to post comments

I think I see the problem.

can't "prove" the program is without type errors without unit tests

I think that last "without" should be "with". :)

For most programming situations sacrificing provability of correctness for development speed is a tradeoff worth considering.

That's the thing... good statically checked languages very rarely limit development speed to such a degree that I'm willing to give up static type checking.

Now, if you're coming from the world of C++ and the likes then it's fully understandable that one might think that statically checked typing is terribly limiting wrt. development speed, but I haven't found that to be the case with OCaml. (The actual value of type checking in C++ is also somewhat debatable since it is so easy to subvert the type system entirely, but I digress...)

By Bárður Árantsson at Sun, 2006-08-27 06:42 | login or register to post comments

Static Typing Isn't Enough

The problem I see with this argument (and most of the other arguments here) is that we are comparing strong, dynamic typing with weak, static typing. C can not catch all type errors, and programming under the assumption that it can will lead to trouble once you do some typecasting, which is necessary in most fairly complex C/C++ programs.

My point is not that no Ruby programmer will ever encounter a type error, nor that C programs are chock full of type errors. I am just trying to make it clear that compile-time type-checking is no more comprehensive that unit testing. The only way to really check for type errors reliably at compile-time is with a strongly, statically typed language such as, dare I say, Java.

Another point I am trying to make is that, because Ruby is strongly typed, the worst behavior that would probably happen even if a type error was not caught by unit testing is that the program throws a NoMethodError and some debugging information, no strange buffer overflows, segfaults, etc. as in C.

By Jake M at Sun, 2006-08-27 17:59 | login or register to post comments

That Depends on the Static Typing

I am just trying to make it clear that compile-time type-checking is no more comprehensive that unit testing. The only way to really check for type errors reliably at compile-time is with a strongly, statically typed language such as, dare I say, Java.

I would suggest that to see what sorts of benefits you can actually get out of static checking you need to beef Java up with JML and use ESC/Java2, or consider a language tht actually has a decently strong type system like Haskell, or an ML. Even Ada or Eiffel would offer a significant improvement over Java in terms static checking and strength of typing.

But honestly, if you get some time, try out some of those - I think you'll find it enlightening as to how much in the way of errors can actually be caught effectively. Which is not to say Ruby is bad, on the contrary it is great as a dynamic language, it jut fills a different niche when it comes to assurance of correct functioning. Mostly I'm just trying to say that there is far more powerful and effective static checking available, which makes the benefits of static checking rather more clear for those cases where assurance matters.

By Leland McInnes at Sun, 2006-08-27 18:26 | login or register to post comments

But Wait, There's More!

Jake McArthur: I am just trying to make it clear that compile-time type-checking is no more comprehensive that unit testing. The only way to really check for type errors reliably at compile-time is with a strongly, statically typed language such as, dare I say, Java

One major quibble, and one minor. First, the major one: there really are static type systems that can prove the absence of undesirable properties at compile time. My canonical example of this is TyPiCal, which proves, at compile time, the absence of deadlock and race conditions in concurrent code. No amount of unit testing can prove the same; all unit testing can do is (maybe, depending upon code coverage and the phase of the moon) prove that you do have deadlock and/or race conditions.

Which leads me to my minor quibble, which is that even Java's type system is pretty weak, and unsound (downcasts are unsound). So just be aware that when you read a lot of what we static-typing aficionados here say, we aren't talking about C, C++, Java, C#... we're almost certainly talking about Standard ML, Objective Caml, Haskell 98, GHC, Epigram, MetaOCaml... as well as various theorem provers such as Twelf, Coq...

So what do I mean by that? Maybe it would be a good idea to attempt to enumerate the assumptions that we make in these conversations. For example, I take it as a given that desirable language properties include:

Memory safety
Strong typing
Separate compilation
Parametric polymorphism
Type inference

In addition, I want one of:

ML-style modules
Haskell-style typeclasses

I suppose I should just say "universal and existential quantification" and leave it at that, but how those are provided matters to some degree.

So when you say "Static typing isn't enough," to the extent that you mean weak static typing as in C or C++, and/or strong but inexpressive static typing as in Java or pre-3.0 C#, you're not likely to get much disagreement. But be prepared to have it pointed out to you that most people here are accustomed to discussing vastly more powerful strong static type systems than those in any of these languages.

By Paul Snively at Sun, 2006-08-27 18:32 | login or register to post comments

Keep in mind there's nothing

Keep in mind there's nothing that says dynamic typing can't give you information at compile time. Remember, dynamic typing is the absence of errors at compile time--it says nothing about warnings.

edit:
Granted, I know of no dynamically typed languages that check to the extent some statically typed languages do, but that's not to say it's impossible.

By Curtis W at Sun, 2006-08-27 19:04 | login or register to post comments

Errors vs. Warnings

True enough. But IMHO, the desirability of this is directly related to how powerful your type system is: if it's not very powerful, then you probably want warnings rather than errors, precisely to avoid the compiler from preventing you from expressing something perfectly legitimate. But if it's very powerful—by which I mean, the types it can express are very accurate—then I probably want errors, because, well, the only way forward would be to generate nonsense code, which is what I'm trying to avoid!

By Paul Snively at Sun, 2006-08-27 19:37 | login or register to post comments

You could generate nonsense

You could generate nonsense code for debuging purpose.

By Felicia Li Svilling at Sun, 2006-08-27 19:47 | login or register to post comments

Yes and No

Well, yes and no. On the one hand, expressive type systems do cut down quite a bit on inaccurate errors. Even still, what might be considered an error to you might be intended behavior, which is why I believe compile time errors simply assume too much. Compile time errors literally force you to "correct" your code, whether it's wrong or not.

By Curtis W at Sun, 2006-08-27 20:00 | login or register to post comments

In the Limit

Curtis W: On the one hand, expressive type systems do cut down quite a bit on inaccurate errors. Even still, what might be considered an error to you might be intended behavior, which is why I believe compile time errors simply assume too much. Compile time errors literally force you to "correct" your code, whether it's wrong or not.

But this is incoherent; if the types get increasingly accurate as the type system increases in expressive power, culminating in a Turing-complete type system such as the Calculus of Constructions, then the type system can express anything that can be computed, and it's literally impossible for there to be anything left to express at runtime that couldn't have been expressed at compile time—that is, the only reason you even need a runtime at that point is to deal with dynamic input. But since your type system is now Turing-complete, there's no kind of dynamic input that can't be determined to be of some type that's expressible at compile time, so once that input has been parsed into a value of that type, the rest of the program is perfectly safe.

Let me be clear about this: I'm not suggesting that developing or using increasingly-accurate type systems is necessarily easy, either in the sense of type inference for them being easy or in the sense of developing accurate annotations in user code being easy. TANSTAAFL. All I'm saying is that the more accurate the types, the fewer "errors that aren't errors" you get from the compiler; they are in a direct proven inverse relationship to each other. The argument that I keep hearing from dynamic typing aficionados is "Well, it doesn't matter how accurate your type system is; you'll always be able to express things at runtime that your type system can't check," and that's just so patently ludicrous, in the strong sense of being incorrect due to the Curry-Howard Isomorphism and the Futamura Projections, that I sometimes despair of people making that argument ever actually getting it. I mean, honestly, if "more accurate types" doesn't mean "fewer 'errors' that aren't errors," then what, exactly, does it mean?

By Paul Snively at Sun, 2006-08-27 21:08 | login or register to post comments

CoC is SN

Turing-complete type system such as the Calculus of Constructions

The Calculus of Constructions is strongly normalising.

By Ronny Wichers Schreur at Sun, 2006-08-27 22:57 | login or register to post comments

Good Catch!

You're quite right, of course; I erred in identifying the Calculus of Constructions with the Turing-complete type systems of, e.g. Cayenne.

Given a Martin-Löf-type-theory-based theorem prover such as Coq, then, we face the question of how to formalize general recursion in such a system. There's a series of papers by Venanzio Capretta on the subject. An excellent starting point is Modeling General Recursion in Type Theory.

Thanks for cleaning up after me!

By Paul Snively at Mon, 2006-08-28 00:20 | login or register to post comments

Yes, but...

Yes, but this has little to do with my main point. Unless the programmer has explicitly typed something, you can't know if it's actually unintended (and thus an error). Regardless, I just wanted to point out that compile time information isn't exclusive to static typing. Compile time errors generally are (barring soft typing) but whether that actually matters is another discussion.

By Curtis W at Sun, 2006-08-27 23:58 | login or register to post comments

Why not optional typing?

I'm pro-types, but think optional prove-what-you-want typing seems like the way to go. Particularly:

* In expressive type systems, inference becomes undecidable and annotation becomes hard - even for a programmer annotating a correct program he intuitively understands. For example, if divergent computations do not belong to every type, then establishing well-typedness of a computation often requires a termination proof. I do not want to be forced to supply such a proof to use my program. On the other hand, I want this level of power in my type system.

* A static typing system doesn't just catch errors before your application ships -- it catches them before your application runs. Interacting with a program with errors can be useful in understanding the errors.

I'd love to hear feedback on what others think of this viewpoint, or links to papers or threads where it's been discussed. I'm particularly interested in hearing about its shortcomings. Thanks.

By Matt M at Tue, 2006-12-05 01:51 | login or register to post comments

Pluggable Type Systems

StrongTalk has been discussed a number of times on LtU but the best starting was the thread on pluggable type systems - Gilad was right.

By Chris Rathman at Tue, 2006-12-05 02:08 | login or register to post comments

Gradual Typing and Epigram

Does gradual typing helps? IMHO it's the most important result in type systems this year, but it will take some time before it can be mixed with advanced type systems.

A static typing system doesn't just catch errors before your application ships -- it catches them before your application runs. Interacting with a program with errors can be useful in understanding the errors.

Epigram provides an interactive interface to the compiler where the programmer can define special areas of code that can't be verified by the compiler yet so a program can be partially verified. And Epigram is one of the most expressive PLs today.

By Daniel Yokomizo at Tue, 2006-12-05 16:10 | login or register to post comments

Thanks! (and to Chris)

By Matt M at Thu, 2006-12-07 02:52 | login or register to post comments

Compile time analysis and type checking

A compile time analysis is nearly a (static) type system by definition, depending what definition you use. Many type systems can be presented as an abstract interpretation, and many analyses not usually considered type checking, like use or aliasing analyses can be presented with standard natural-deduction style typing rules.

Does anyone know of a good name or good work about languages that include a fair bit of static analysis, but also rely on some runtime checks. Running a static type system just to report errors is a degenerate case that can still be called "dynamic" (but projects seem to start running with -Werror as they grow). Deferring checking at a few points to runtime as in MetaOCaml or the dynamic type in any of Haskell, OCaml, Clean is more interesting.

By Brandon Moore at Sun, 2006-08-27 19:55 | login or register to post comments

Not by my definitions of

Not by my definitions of static and dynamic typing, which I assume to be fairly standard. Namely, static type systems present any incontinuity in the type system as an error at compile time, whereas dynamically typed systems present all errors at runtime when they happen. There is another type system which can be seen as a mixture of the two called soft typing*, although I wouldn't consider this to be what you're referring to in the second half of your post.

* Some of you might remember me supporting this option, although I have since switched back to complete dynamic typing for several reasons.

By Curtis W at Sun, 2006-08-27 20:11 | login or register to post comments

Ada does a fiar amount of

Ada does a fiar amount of static type checking, but the type system also allows you to express constraints that can only be checked at runtime (somewhat confusingly called subtype constraints). The language ensures safety, of course, so as a programmer you don't have to worry about checking the constraints.

The simplest case is something like:

type Width is new Integer range 10..20;

The range cannot always be checked at compile time, so this check is not part of the static type checking, but is done at runtime. Good compiler obviously report clear cases of violation of the constraints at compile time.

A more interesting check is arrays bound checking.

By Ehud Lamm at Sun, 2006-08-27 20:11 | login or register to post comments

Interesting

Ah, that's interesting. I knew Ada had support for such constructs, but didn't know that it postponed some type checks until runtime. From what I can tell it's still a fairly static language, though.

By Curtis W at Sun, 2006-08-27 20:25 | login or register to post comments

Of course it is.

By Ehud Lamm at Sun, 2006-08-27 20:33 | login or register to post comments

The Right Question

IMHO, this is the right question. There does seem to me to be a cluster around abstract interpretation, dependent types, multi-stage programming, and... um... I still don't know what category Ontic falls into. Is it a dependent type system, or is it something else?

In any case, the central observation seems to be that the static vs. dynamic typing debate seems to be contingent upon the classical phase distinction of compile time vs. runtime. Multi-stage programming throws a monkey-wrench into the phase distinction by definition; abstract interpretation essentially attempts to arrive at a "good enough" approximation of dynamic semantics at compile time; dependent types include terms depended upon at both compile- and runtime... in short, rethinking the phase distinction induces a rethinking of typing. Perhaps this is closer to what Kay Schluehr was driving at in another post in this thread.

By Paul Snively at Sun, 2006-08-27 20:24 | login or register to post comments

Something else

I still don't know what category Ontic falls into. Is it a dependent type system, or is it something else?

I think it can be seen as a dependent type system with the addition of subtyping. However, this change is pretty large. It doesn't become a matter of conventional types as such anymore, but just what set of values a variable may hold in this case, therefore whether this operation is valid. It becomes very much like programming in an untyped language, except crucially all operations are checked and verified as correct at compile time. I think, even, that the need for annotations would be very low in such a system. Sure, you can express types that depend on terms, but that's not the big deal anymore.

Tim Sweeney comments on these ideas in this old thread. Frank Atanassow's counter viewpoint is, I think, that such a type system is in a sense too powerful, and a extra layer of abstraction is required in order to not lose desirable properties; particularly that isomorphism should be preferred over Ontic's reliance on type equality.

By rebooted at Mon, 2006-08-28 08:43 | login or register to post comments

Desirable language properties

What do you mean exactly by "Strong typing"? And probably related, how is Java downcasting unsafe?

I see Java downcasting as an artefact of the lack of dynamic dispatch in Java. To invoke method m on object o, you need to first make o into an object that responds to m.

Why do you "need" type inference? Parametric polymorphism? (what would you think if the only parametric polymorphism had to do with default values for missing arguments?)

By Denis Bredelet -jido at Wed, 2006-12-06 11:56 | login or register to post comments

Strong Typing, Downcasts, and Inference

Denis Bredelet-jido: What do you mean exactly by "Strong typing"?

I don't know that I'll be successful in being exact, but let me once again make an informal attempt. By "strong typing," I really mean essentially two things: first, that there aren't automatic conversions, e.g. that "123" + 456 doesn't result in either "123456" or 579. Second, that unconstrained casts, such as between integers and pointers in C, are disallowed. There may, of course, be normal functions to convert, e.g. between integers and strings. By my definition, the overwhelming majority of languages discussed on LtU are strongly typed: Scheme is just as strongly typed, if not moreso, as O'Caml. C, C++, and JavaScript, OTOH, are weakly typed.

So it's important to clarify that I am a fan of strong static typing (O'Caml, Haskell, Scala...) vs. weak static typing (C, C++...)

Denis Bredelet-jido: And probably related, how is Java downcasting unsafe?

The technical term is "unsound," and it's not just Java: any language that allows casting from a supertype to a subtype is unsound, which just means that you can't know at compile-time that the value in question is actually of the subtype, so there has to be a runtime component to deal with the failure case: a ClassCastException is thrown in Java, for example. Some languages, such as O'Caml, are sufficiently concernred with type soundness that they disallow subcasting; see the O'Caml manual, for example: "Indeed, narrowing coercions would be unsafe, and could only be combined with a type case, possibly raising a runtime error. However, there is no such operation available in the language." Such languages generally have more sophisticated type systems that support safe alternatives to downcasting; my comment in the "Open data types and open functions" thread contains relevant links.

Denis Bredelet-jido: I see Java downcasting as an artefact of the lack of dynamic dispatch in Java. To invoke method m on object o, you need to first make o into an object that responds to m.

Quite right. I see dynamic dispatch as the lack of structural typing, open self types, and properly-typed binary methods; see the links referred to above to see why. :-)

Denis Bredelet-jido: Why do you "need" type inference? Parametric polymorphism? (what would you think if the only parametric polymorphism had to do with default values for missing arguments?)

"The more interesting your types get, the less fun it is to write them down!" — Benjamin Pierce

I think parametric polymorphism is far too powerful to limit to missing arguments. The reasons why are, once again, best articulated in the papers linked to from the comment linked to above. The power you get in languages like O'Caml and Scala, which essentially couple parametric polymorphism with a couple of other type system design elements to arrive at type-safe virtual types, is really quite something to behold, and I worry about losing that power if you tinker too much with any of those design elements.

By Paul Snively at Wed, 2006-12-06 16:58 | login or register to post comments

Parametric polymorphism vs generic types

By "strong typing," I really mean essentially two things: first, that there aren't automatic conversions, e.g. that "123" + 456 doesn't result in either "123456" or 579.
The "automatic" conversions can be restricted using some rules. For example, if '+' is defined both for string and numbers, we may decide to use the left side for typing it. Then 456 can be converted to "456" only if there is a conversion function from numbers to string defined. There can be a restriction on conversion functions that the 'from' and the 'to' types have no intersection.

Such languages generally have more sophisticated type systems that support safe alternatives to downcasting; my comment in the "Open data types and open functions" thread contains relevant links.
I read On the (un)reality of virtual types, and I am a bit confused. Probably did not understand enough of the paper to see the point (the formulas in the HTML version are illegible which does not help ;) ). It seems to deal mostly with generic types, with some questions about coupled or recursive generic typing. Also I noticed that eg. class manager calls methods on unconstrained types, which is probably possible because of O'Caml type soudness checks. And the authors insist that structural classes are preferable to declarative classes though I did not understand what they base that statement on.

I think parametric polymorphism is far too powerful to limit to missing arguments.
Do you include generic types in parametric polymorphism?
I had the idea that it was related to allowing different sets of parameters for a function, which does not strike me as an important language feature.

By Denis Bredelet -jido at Thu, 2006-12-07 02:21 | login or register to post comments

Method overloading

is the name for allowing different sets of parameters for a function. Quite unrelated to parametric polymorphism, sorry. I guess when I studied it that was in French which explains the confusion.

By Denis Bredelet -jido at Fri, 2006-12-08 22:24 | login or register to post comments

I am just trying to make it

I am just trying to make it clear that compile-time type-checking is no more comprehensive that unit testing

I think others have disproven this by now, but I'll just chime in with one more data point; have your ruby-like first class messages, all statically proven:

http://www.cs.jhu.edu/~pari/papers/fool2004/first-class_FOOL2004.pdf

A simple object encoding and type system based on polymorphic variants can probably type most of Ruby, all without running a single unit test of whether an object responds to the proper message. I think this demonstrates that static typing is more comprehensive than unit testing, as you can miss a unit test, but you can't run an ill-typed expression.

By naasking at Mon, 2006-12-04 22:51 | login or register to post comments

Ruby is more dynamic than that

A simple object encoding and type system based on polymorphic variants can probably type most of Ruby

Most popular uses of Ruby today make use of meta-programming to extend classes at runtime. While I agree that such metaprogramming uses can be brought back to compile time in Ruby you dynamically add a method do a class (along with other class/method/object modifications). Also how can we type dynamic attribute-based finders without having something more than A simple object encoding and type system based on polymorphic variants ?

Don't get me wrong I love static typing but dynamic languages are really dynamic and there's useful features that can't be properly type-checked with simpler type-systems (i.e. without some form of dependent typing, even if it's minor).

By Daniel Yokomizo at Tue, 2006-12-05 16:23 | login or register to post comments

While I agree that such

While I agree that such metaprogramming uses can be brought back to compile time in Ruby you dynamically add a method do a class (along with other class/method/object modifications).

With the match functions I pointed to, you simply wrap the existing object in a closure that responds to the method you're adding, and forwards the others to the wrapped object.

I agree that Ruby and other dynamic languages have far more primitive "features" than simple message-passing, but if they had stuck to the core pure OO-message passing semantics and implemented everything on top of that, I don't see why match functions couldn't do exactly what I said.

The attribute finders are interesting, and can be done by decomposing the message to the object. I think it might be doable statically if all objects responded to a standard core protocol (get_attributes or something).

By naasking at Fri, 2006-12-08 16:21 | login or register to post comments

State is hell

With the match functions I pointed to, you simply wrap the existing object in a closure that responds to the method you're adding, and forwards the others to the wrapped object.

This is fine if you don't have side-effects but in Ruby (or Obliq my favorite OO language) a object can change itself so that you can have one-shot methods. In order to type that we need to use some form of temporal logic, because the type of a reference x between statements n and m may change due to either intervening statements or concurrent modifications from other threads/processes. Also we need some form of dependent typing because we can add/remove different methods based on values even if we go purely functional (using "|" for record extension):

foo :: Int -> {x :: Int} -> ????
foo bar baz =
    case bar of
        1 -> {y = baz.x | baz}
        2 -> {z = baz | baz}
        _ -> {x = "Hello World"}

I agree that Ruby and other dynamic languages have far more primitive "features" than simple message-passing, but if they had stuck to the core pure OO-message passing semantics and implemented everything on top of that, I don't see why match functions couldn't do exactly what I said.

There's more to objects than just message-passing. Also OO type systems can become quite complex.

The attribute finders are interesting, and can be done by decomposing the message to the object. I think it might be doable statically if all objects responded to a standard core protocol (get_attributes or something).

We can always encode some feature of dynamic languages into reflection/introspection, but IMHO it just shows that you aren't properly typing it. A better approach would use subtyping (or row polymorphism) to let the user specify a template:

find_all :: for-all obj, template . obj <: template => Class obj -> template -> [obj]
users :: Class User
find_all users {name = "Jim", age = 15}

But this encodes just this usage of analyzable first-class messages and would require a non-trivial type system to let us write the body of find_all. Also the approach fails when we start to encode other pieces of logic in the message name. In Smalltalk we could let the programmers write User findAllNamed: "Jim" orBornBefore: "1993-01-01" orderBy: 'lastName' descending and handle it with doesNotUnderstand.

Such types of programs are very hard, if not plain impossible, to statically check because they aren't written in a typeful way. There much of reflection and introspection in dynamic languages and there's no happy marriage between those features and static typing. A good survey of these problems and current solutions is in this thesis.

I'm a proponent of static typing (Haskell is my favorite PL and the one I use at home even for scripting), but AFAIK there's no type system today that can type dynamic languages without throwing some babies out with the bath water. Also there are strong evidence that dynamic languages offer benefits (e.g. doesNotUnderstand, hot code replacing with changing behavior) that can't be typed today.

By Daniel Yokomizo at Sat, 2006-12-09 17:41 | login or register to post comments

Eh...

Daniel Yokomizo: Also there are strong evidence that dynamic languages offer benefits (e.g. doesNotUnderstand, hot code replacing with changing behavior) that can't be typed today.

I'm having difficulty seeing "doesNotUnderstand" as a benefit. Could you perhaps explain what you mean here?

As for "hot code replacing with changing behavior" not being typable, I guess I haven't ranted about Acute enough recently:

This requires a synthesis of novel and existing features:

type-safe interaction between programs, with marshal and unmarshal primitives;
dynamic loading and controlled rebinding to local resources;
modules and abstract types with abstraction boundaries that are respected by interaction;
global names, generated either freshly or based on module hashes: at the type level, as runtime names for abstract types; and at the term level, as channel names and other interaction handles;
versions and version constraints, integrated with type identity;
local concurrency and thread thunkification; and
second-order polymorphism with a namecase construct.

Bullets two and five are all about type-safe hot code swapping with (potentially) changing behavior. Acute has been discussed on LtU relatively extensively.

By Paul Snively at Sun, 2006-12-10 16:04 | login or register to post comments

unbound -> late binding

I'm just guessing I'm on topic, based on seeing hot code replacing right before a question about doesNotUnderstand. In the context of runtime exceptions, you can try to fix problems in-place rather than doing a non-local exit. When no method is found, you can represent this explicitly by dispatching doesNotUnderstand, which might fix the fault.

Paul Snively: I'm having difficulty seeing "doesNotUnderstand" as a benefit. Could you perhaps explain what you mean here?

For example, I used doesNotUnderstand at Taligent for lazy library loading. When I had a class Foo implemented in library Bar, I needed only define a stub for Foo with not much more than doesNotUnderstand defined, which loaded library Bar and then re-dispatched with the original message.

(No, we didn't use Smalltalk at Taligent. One of my several personal projects overlapping with uses at work included a total rewrite of XLisp that added things I wanted from Scheme and from Smalltalk, so it had a bluebook style object system. Because dynamic languages were essentially forbidden at Taligent by the static religion, I used this to write ad hoc scripts for driving things in unit tests.)

By Rys McCusker at Sun, 2006-12-10 17:55 | login or register to post comments

doesNotUnderstand

I'm having difficulty seeing "doesNotUnderstand" as a benefit. Could you perhaps explain what you mean here?

Something in the lines of doesNotUnderstand (e.g. method_missing in Ruby) gives a chance to an object decorate other objects with some generic behavior without having to understand the original call context. In the Ruby on rails example method_missing was used to execute methods that mimic select queries (e.g. Person.find_by_user_name_and_password(user_name, password)) without having to predefine them. This particular example could be statically typed by using something like HaskellDB, but the nice thing about such hooks is that you don't need to switch to another language or introduce new features for your library. Something similar is used for both XML and HTTP request processing. While the results in type systems are truly exciting, we still don't have an statically typeable solution for those programming patterns.

As for "hot code replacing with changing behavior" not being typable, I guess I haven't ranted about Acute enough recently:

Acute is really groundbreaking but it's features don't cover everything possible with hot code replacing in Erlang or Smalltalk. Sometimes, as Rys points, the code we are replacing is buggy and triggers a condition/exception which can correct the code or just wait for the programmer to fix it by starting the debugger and halting the execution (which I think is an awful solution to this problem) or wait for someone to issue a code change request to fix the issue which may change the internal types used without interrupting the server and restarting a transaction. The Erlang movie shows an example of this. AFAICT you can do such in Acute up to certain limits, which goes back to my point above. The nice thing about hot-code replacing in dynamic languages is that it works right out of the box, you don't need to change the language or introduce new features. If I wrote my highly concurrent server using Haskell and STM I would not be able to use Acute's features because the type systems are different. Until we have a comprehensive type system that deals with all those issues we can't say that the problem is solved. Right know we have to say: well for this problem you use Haskell, this you solve in XDuce, for that one you need Acute, and pay no mind to the man behind the curtains making all these languages work together without having runtime or programming time hidden costs.

_{That said Acute is wonderful and it's an argument I use to show that type systems can solve real world problems without requiring a PhD or that you are Oleg.}

By Daniel Yokomizo at Sun, 2006-12-10 20:44 | login or register to post comments

You Know My Response...

...will be "sure, anything's possible when you work without a safety net." ;-)

My only point is to say that hot code swapping working "out of the box" is only true to the extent that the new code doesn't violate any of the invariants of the old code, begging the question as to how you avoid it.

Your point that we don't yet have all of the features we'd like in a single type system implemented in a single, or at least a single family of, languages is certainly well-taken: it's ground that Anton van Straaten and I have covered rather extensively before. Given your reply, though, I think I misunderstood your earlier point, as it sounded at the time like another claim that dynamic typing can do things that static typing can't—a claim that will always get a response from me. :-) However, I see now that you were merely pointing out that, absent a unified type system/language with all of the protections we might desire, dynamic languages tend to support these things more broadly, albeit with few or no compile-time guarantees. I agree—I understand Alan Kay to hold what I consider to be similar views (basically, until computer science evolves into an actual engineering discipline, use late binding to shield your systems from the inevitable changes), and I'm broadly sympathetic to this view. Thanks for the patient elucidation. :-)

By Paul Snively at Mon, 2006-12-11 00:03 | login or register to post comments

As you may know mine.

That was the exact point I was trying to make (to you). Not that I care about this issue, I'm more than happy using Haskell for pretty much everything outside my day job. Saying that, one thing I learned from LtU (reading posts from people like Marc, Anton, Frank and you) is that there are people that don't think like me and don't want to play nice with the compiler (by encoding everything in a way that the type system accepts it). Also there are entire classes of programming idioms that are statically typeable but there's no unique type system for them.

It's important that static typing afficionados do understand the features from dynamic languages so that we can design a type system that types these (or similar) capabilities. In my original post I wanted to refute an assertion that I heard several times (including from me): that a simple type system is sufficient to type all interesting and useful features of dynamic languages. It isn't true today and if we don't acknowledge this it won't be true tomorrow.

By Daniel Yokomizo at Wed, 2006-12-13 02:18 | login or register to post comments

Nothing to Add Here...

...but amen, on all counts.

By Paul Snively at Wed, 2006-12-13 09:40 | login or register to post comments

Only if you ignore unit testing.

"the difference is that type-checking at compile time results in errors being caught by the programmer rather than the user. [...] the big one is that your user never sees whole classes of errors."

Only if you ignore unit testing. Unit tests are meant to catch different things than type-checking, true, but proper unit testing in properly designed object-oriented programs will prevent type errors from occuring just as well as explicit type-checking will.

This is actually about static typing versus dynamic typing, which is a whole different ball game that has been argued time and time again. I personally believe that both have their uses, but hold that dynamic typing does not necessarily imply that bugs are more difficult to discover or squash.

By Jake M at Fri, 2006-08-25 18:21 | login or register to post comments

No.

[...] but proper unit testing in properly designed object-oriented programs will prevent type errors from occuring just as well as explicit type-checking will.

No matter how many "properly" qualifiers you use, no amount of unit testing can ever prove the absence of type errors. Whether you/your customers can live with that is up to you, but your statement is patently false.

By Bárður Árantsson at Fri, 2006-08-25 18:40 | login or register to post comments

You misread my statement.

Likewise, no amount of weak compile-time type-checking can prove the absence of type errors. I didn't say unit testing would prevent all type errors, just that it would prevent as many as compile-time checking would; the nice thing is that it catches so many other kinds of errors as well. Try that with any amount of static type checking.

By Jake M at Sun, 2006-08-27 18:05 | login or register to post comments

Narrowing In

Jake McArthur: Likewise, no amount of weak compile-time type-checking can prove the absence of type errors.

This would seem to be a tautology.

Jake McArthur: I didn't say unit testing would prevent all type errors, just that it would prevent as many as compile-time checking would...

The only problem with this is that it's trivially false; see my other reply to you for a specific counterexample.

Jake McArthur: ...the nice thing is that it catches so many other kinds of errors as well.

Practically speaking, this is correct; a program typically does rely on having some dynamic semantics, most functions are partial rather than total, and you want to ensure that your user doesn't encounter unexpected, let alone undefined, behavior. But as I've recently written elsewhere, static typing and testing are in a dual relationship to each other; an expressive strong static type system can prove the absence of undesirable properties, which testing can't do, whereas sufficiently careful testing can demonstrate to an acceptable level of probability that the portion of the domain not addressed by the type system is irrelevant to the successful execution of the program. The whole question becomes a lot more interesting in systems with dependent types, because they blur the phase distinction slightly, but not entirely, often at the "cost" of making type inferencing undecidable in the general case. See my reply elsewhere to Kay Schluehr for links to a couple of excellent launching-off points to learn more about dependent typing and how it greatly aids proof of correctness of software at compile time.

Jake McArthur: Try that with any amount of static type checking.

Again, assuming a program that takes dynamic input and/or in which not all functions are total, static typing can prove things that unit tests cannot, while unit tests can demonstrate likely correct behavior of the dynamic semantics of the program. They're each capable of doing things that the other is not. Where we tend to get hung up is at the intersection of the sets of things they can do, which is necessarily informed by our assumptions about the type system and dynamic semantics of languages X, Y, Z... that we have in mind when we make our claims. As I wrote in my other reply to you, it's important to be aware that we're not typically talking about C, C++, Java, C#... here (we'll usually say explicitly when we are). We tend to discuss vastly more expressively powerful languages than that, and that makes all the difference in the world to the claims we make and why we make them.

By Paul Snively at Sun, 2006-08-27 18:46 | login or register to post comments

For many functions units

For many functions units tests can't even show that a function's total where a static type system could.

By Philippa Cowderoy at Mon, 2006-08-28 20:10 | login or register to post comments

I won't repeat all

of what Paul Snively has already said, but I feel I should point out that I'm specifically not saying that unit tests are useless, far from it. They are very useful for testing properties which cannot be expressed in the type system of the language you're working in.

By Bárður Árantsson at Sun, 2006-08-27 19:42 | login or register to post comments

I think it's a little unfair

I think it's a little unfair to say someone else misread your statement when you're now including an additional qualifier ("weak") in it.

And really, you don't seem to know what you're talking about when it comes to the classes of errors that can be caught with a sufficiently powerful static type system. Encoding data type invariants in the type system is a standard trick for some of us, for example.

By Philippa Cowderoy at Mon, 2006-08-28 20:07 | login or register to post comments

Time for me to back down.

"And really, you don't seem to know what you're talking about when it comes to the classes of errors that can be caught with a sufficiently powerful static type system."

After reading over some replies, I must reluctantly agree to this. I'm resisting the urge to defend myself, as I know I will probably just come off as looking like an idiot. :)

By Jake M at Tue, 2006-08-29 03:35 | login or register to post comments

To Be Fair...

...we've all said things that have turned out to be beyond our ken before; there's nothing wrong with that. All that I think has happened here is that you might've wanted to qualify some of your points about strong vs. weak static typing earlier, because, really, there's nothing controversial about pointing out that C and C++ are weakly typed, and when it comes to the expressiveness of their type systems, they're also lacking.

So now that we've established that (I hope!) it occurs to me that you might be in for some relatively pleasant surprises, if we haven't hounded you away already. I personally like Objective Caml a lot. It's a strict (all parameters to a function are evaluated before the function is called, as opposed to "non-strict" or "lazy," in which terms are evaluated only when actually used), impure (meaning that it supports unconstrained variable mutation, as opposed to "pure," which does not), functional (functions are "first-class," meaning that they can be taken as parameters, returned as results, stored in variables, etc. etc. etc. just like anything else can) language that also happens to have a pretty traditional class-based object system with multiple inheritance, etc. very much like C++ in its semantics. Developing Applications with Objective Caml is a great way to start learning it, but if that's a bit too basic, Jason Hickey's tutorial might be a faster route to entry. In any case, it's a very popular dialect of the ML family. The ML family in general emphasizes parametric polymorphism ("Generics," e.g. templates in C++ or generics in Java 5), the use of algebraic data types ("disjoint unions" or "sum types" or "variant types," depending upon who you talk to) and pattern matching on them, and powerful module systems.

A lot of folks here like Haskell a lot. Haskell can also very loosely be considered a member of the ML family inasmuch as it also emphasizes parametric polymorphism, algebraic data types, and pattern matching. However, Haskell is lazy, pure, lacks an object system (as a built-in feature), and has "typeclasses" vs. an ML-style module system (but note that typeclasses and modules have a lot of overlap). Haskell is also consciously even more language-research-oriented than the other MLs; major concepts in functional programming, such as Software Transactional Memory, tend to show up in the GHC implementation of Haskell before they do anywhere else. Conversely, Oleg Kiselyov seems dedicated to demonstrating repeatedly all of the ways in which Scheme, ML, and Haskell are kissing cousins, frequently by doing things that are "impossible" in one language or the other. As far as learning Haskell goes, many people swear by The Haskell School of Expression, but you should ask the Haskellers here about that.

In any case, I do hope that all of our chiding hasn't put you off LtU, and that you find some of the things that others of us have spent some time learning to be of interest to you as well. Please don't hesitate to ask questions or offer further comments, as we're all here to learn from each other (as I am reminded to my chagrin whenever I say something in error)! :-)

Update: I neglected to add that the best resources I'm aware of at the moment for learning about the breadth of what static typing can do remain Types and Programming Languages, often referred to here simply as "TAPL," and Advanced Topics in Types and Programming Languages, or "ATTAPL." Also, be aware that those of us who've been using an ML or Haskell for a while frequently employ "tricks of the trade," such as relying on abstract types having existential quantification, so-called "phantom types", etc. that are perhaps best described as "underdocumented"—they're very much part of the lore of expressive static typing, but I honestly don't quite know where to point you to learn enough about them to have some confidence of giving you that "aha!" moment where it all fits together. Suffice it to say for the moment that those of us who seem looniest about static typing, like myself, have this kind of "typeful programming" very much in mind in everything we say on the subject. Oleg Kiseleyov's work, linked to above, tends very strongly to rely on this kind of advanced understanding of the type systems of ML and Haskell—certainly far more advanced than my own—and how they relate to the untyped lambda calculus, hence Scheme.

Finally, a handful or two of us are interested in the most formal foundations of type theory, programming languages, and computing, so we spend some time with what are called "theorem provers," which come in basically two major flavors: automated, and what are called either "interactive theorem provers" or "proof assistants." Particularly popular examples in the context of programming language research include Twelf and Coq. The connection to programming languages lies along both pragmatic and theoretical dimensions: pragmatically, the ML family exists because the first ML was a scripting language for an early theorem prover, LCF. Theoretically, you'll hear a lot on LtU about the "Curry-Howard Isomorphism" or "Curry-Howard Correspondence," which I prefer to summarize as "propositions as types," but see the excellent Wikipedia article for many more details. The astonishing thing is that there is an isomorphism between types and theorems in intuitionistic logic, and programs and proofs! So with a theorem prover based on a sufficiently expressive type theory, as both Twelf and Coq are, you can develop proofs about algorithms and data structures, and treat the theorems as types and the proof as implementations of those algorithms and data structures. The connection is so strong that Coq, for example, supports the automatic extraction of a module or program from proofs to Scheme, Objective Caml, or Haskell! Because the type system of Coq is so vastly more expressive than those of Objective Caml or Haskell, though, the resulting code sometimes relies on constructs that violate type safety in those systems (Obj.magic in Objective Caml; Unsafeperform... in Haskell). Note that as cool as this is, it's also impractical for any but the smallest or most important kinds of code: think man-rated code where, if it fails, someone dies. On the small side, though, important pieces of, e.g. compilers for other languages have been proven and sometimes even extracted in this fashion.

I hope that's enough by way of introduction, but again, please do jump in and ask/assert anything else you wish to.

By Paul Snively at Tue, 2006-08-29 06:57 | login or register to post comments

Scheme and ML

I can affirm that learning Alice ML has changed the way I understand and use Scheme.

By Webley Webster at Tue, 2006-12-05 19:30 | login or register to post comments

Thank you very much for

Thank you very much for taking the time to write up that very nice introduction to these concepts. I had already begun studying these independently long before I read this (even before I made the uneducated assertions of this thread earlier, but I will protect my dignity by saying that I merely didn't understand the implications of what I had learned up to then), but I will certainly check out all the links you gave me. Sorry that I didn't see this comment before, but I have been quite busy lately and haven't had the time for my academic hobbies.

I wasn't put off by anybody here. In fact, the very strong opposition to what I was saying was a very strong clue that I was not approaching this in the way of the typical LtUer, so I consider it a learning experience.

By Jake M at Tue, 2006-12-05 21:39 | login or register to post comments

Welcome Back!

Hi Jake! Welcome back. I hope you're enjoying your holiday season, assuming that that applies to your part of the world. :-)

You're quite welcome for the brief write-up. I'm gratified to hear that it's proving useful to you. Your return has provided me the opportunity to review it, which I hadn't done in some time, and to reflect a bit upon the need for folks like myself to be crystal clear from the outset, when posting here, that most programmers' experience of static typing comes from the C/C++/Java/C# tradition, and so doesn't accurately reflect what we experience when we work in O'Caml or Haskell, for example. It's far too easy for me to go off about static typing, forgetting that I no doubt sound like a madman to most other programmers!

Thankfully, LtU now boasts an excellent Getting Started thread to help elucidate some of the foundations of the conversations that tend to take place here. There are also some new site policies and procedures that I believe are helping to modulate some of the more, um, passionate kinds of conversations that have taken place here (and intellectual honesty compels me to confess that I have been a frequent guilty contributing party to most of them). In summary, I believe and hope that you'll find LtU a congenial place in which to collaboratively explore the ramifications of various programming language designs, proffer ideas and questions, receive questions and challenges, and generally have a lot of fun doing so. Finally, thanks for your extremely generous closing—I can only hope that I respond to feedback with the same level of grace and patience when I grow up.

By Paul Snively at Wed, 2006-12-06 16:01 | login or register to post comments

For Finite Programs

For finite programs, you can write enough tests.

For a given function and any given program size, there is some finite set of testcases that catch all programs of that size or shorter which fail to compute the desired function on all inputes. (leaving aside concurrency and IO for the moment.) Knowing that a program is correct seems to subsume most useful forms of type correctness, except perhaps those having to do with readability and maintainability.

It's obvious that there is some complete test suite with at most one
example for each program satisfying the length bound, but I don't know if you can construct such a test suite without an oracle for the halting problem.

Seeing that exponentials and oracles are even an issue, if you actually care about your program being correct, it's looking more attractive to try to argue that directly rather than to show that it passes some test suite, and try to argue that passing the test suite is enough to tell you that it is correct.

By Brandon Moore at Sun, 2006-08-27 20:14 | login or register to post comments

I may be misreading you,

but finite programs don't necessarily run in finite time. That is, if your type system cannot impose/prove termination, you end up in a situation where the pass/fail status of any particular test is only semi-decidable. For any practical testing system you'd end up with results Pass, Fail or Did Not Finish. Things only get fuzzier from there... ;)

By Bárður Árantsson at Mon, 2006-08-28 20:22 | login or register to post comments

Termination

Right. I should have said that there is some test suite which only correct programs can pass - at least as long as we only consider specifications where "correct" implies "total". (I don't see how to make a complete test suite when a correct program is supposed to diverge on some inputs).

This would seem to be enough unit tests to prove that a program has no type errors. On the other hand, I hope the impracticality encourages people to think twice before claiming that tests are "as good as" types, without so much as specifying what type systems they are talking about.

By Brandon Moore at Tue, 2006-08-29 06:22 | login or register to post comments

Here we are again

Paul, lets assume you have the ultimate type checker, a proof-oracle that breaks the Turing barrier and checks the correctness of any program in arbitrary depth in the ultimate declarative language at compile time but the only output you get from it, in error case, is the error message and some awkward traceback. Here I'm getting stuck, because I can't even debug the whole mess since it doesn't compile! Instead of stepping into the code and being guided by runtime-typechecks the only chance to get the program correct is doing a code review. Conclusion: giving ever more power to a godlike type-oracle does depower the programmer in the end and gives him close to nothing. Do you know the tale from the fisherman and his wife? The danger I see with your tireless rhethorics is that you are turning a tool into a festish.

By Kay Schluehr at Sat, 2006-08-26 06:30 | login or register to post comments

If our ultimate language is

If our ultimate language is suitably compositional it's actually pretty easy to break it apart into small pieces until we find the one containing the error. Heck, mere modularity often provides a good starting point. It's even possible to write tools that do this for you, perhaps under a little guidance, although making them scale to large problems is, I believe, still an ongoing effort.

By Philippa Cowderoy at Sat, 2006-08-26 08:58 | login or register to post comments

Reductio ad Absurdum

I certainly can't respond in any meaningful way to "let's assume you have... a proof-oracle that breaks the Turing barrier," especially when I've not only explicitly disclaimed that, but I've also explicitly disclaimed the possibility except in extremely uninteresting cases of having all computation happen at compile time; see here for a specific example. Finally, I've never said anything even remotely like "it would be desirable for all computation to happen at compile time."

So you start with a faulty (impossible, in fact) premise that certainly didn't come from me, and use it to reach a faulty conclusion ("giving ever more power to a godlike type-oracle does depower the programmer in the end and gives him close to nothing.") Frankly, it's bizarre on its face; how does a compile-time error, complete with line and character-range information, take power from you? Unless, of course, you're used to working in a language in which the only "compile-time" errors are syntax errors and any other kind of compile-time error seems baffling! :-)

Kay Schluehr: The danger I see with your tireless rhethorics is that you are turning a tool into a festish.

You say that like it's a bad thing.

Unfortunately, what you're ignoring is Jake McArthur's post that I responded to, and two others from Jake, which Philippa Cowderoy and BÃ¡rÃ°ur Ãrantsson responded to. I wonder why it is that you don't observe that Jake is turning a tool (unit testing, in this case) into a fetish? It's a much stronger claim, given that Jake is making incorrect statements and I'm not. The tirelessness that you observe stems precisely from my unwillingness to let simple falsehoods slide. Complex falsehoods... well, those I call "science," which I then attempt to advance the state of.

Update: I went back and re-read what I wrote that you're responding to, and I can't help but notice that you seem to have overlooked my "In the currently popular statically-typed languages, this benefit is mostly trivial (IMHO)..." and "Again, how significant this is is highly dependent upon your application domain, as well as the power of your tools." So perhaps you could help me by explaining what level of qualification would suffice. :-)

By Paul Snively at Sat, 2006-08-26 13:07 | login or register to post comments

tool obsession or dedication

Paul Snively: You say that like it's a bad thing.

I got a big kick out of seeing this line again. It was one of my favorite jokes that Andrew Shebanow introduced. I had the impression it must have been a running joke in Apple DTS. Whenever I'd rant about something bothering me, Andrew would interrupt to say, "You say that like it's a bad thing." So, thanks for the flashback. :-)

Hey, I'm actually going to say something substantive, that might be on topic if you squint a little. I started a new job two weeks ago (company name intentionally left blank), and already I've started fielding queries that sound like this: "What do you know about fixing random crashing bugs?" So yesterday I finally gave in and started lecturing folks on relevant topics, like entropy and how large systems become chaotic.

Mainly I focused on a fire fighting metaphor: putting firebreaks where you hope to contain a conflagration. Essentially I told them one must add code and spend cycles to detect a transition from well-behaved code (doing the expected) to shades-of-gray code (doing something ambiguous), and then complain about any and all such transitions. I tried to claim that failure states tend to be reached by walking through the gray zone first.

In essence I was dissuading them from the belief that if code has not yet crashed, that everything must be perfectly fine. I told them, no, there's lots of running states of marginal value which, though not outright incorrect, enable a later transition to failed invariants causing failure.

The problem with type checking systems often seems to be a focus on a clean binary distinction between okay (perfect) and not okay (broken). There are also permitted states in typed systems you also don't want to enter because they inhabit gray zones in the thinking of coders, where pre and post conditions are not very well thought out (and hence should be shunned). In fact, the word "shun" seems an interesting way to tackle gray zone logic.

This is me, signing off from left field without a conclusion.

By Rys McCusker at Sun, 2006-08-27 04:19 | login or register to post comments

Challenging Myself to a Dual

Rys David McCusker: I got a big kick out of seeing this line again. It was one of my favorite jokes that Andrew Shebanow introduced. I had the impression it must have been a running joke in Apple DTS. Whenever I'd rant about something bothering me, Andrew would interrupt to say, "You say that like it's a bad thing." So, thanks for the flashback. :-)

Good times!

I definitely remember Andrew and Keith Rollin using the line a lot in my MacDTS days. DTS-culture-specific or not, it accomplishes what I intended in using it, which is to remind people that some apparently (and sometimes truly) wild differences in focus are easily explained as people being consistent about following their values, the point being that those values might be different from yours. People who haven't known me for more than about the past three years might not realize that I come from an extensive dynamic language background and therefore assume that my championing of static typing is likely to be based on ignorance. Sometimes I feel it's worth disabusing people of this notion, sometimes I don't, largely depending upon my own perception of my interlocutor's sincerity in being willing to have their own preconceptions challenged (a fact that you'll no doubt recall well from some highly-charged discussions on Hack the Planet).

I'm also guilty of writing counter to my audience. In this context, I encounter a fair number of folks who simply—sorry—don't have any idea what static typing can and can't do, so I preach static typing to them, hopefully with enough information that it isn't merely dogma. But put me in a context where you've got some theorem-proving nut who can't be bothered to write any tests because "I've proven this algorithm correct" and I'll start quoting Knuth ("I have merely proven this code correct; I haven't actually tried it") at them, talking about the probable lack of domain analysis in their proof, etc. Interestingly (and this may be pure self-selection), I encounter dramatically less of the latter than the former. Regardless of my sense of the statistics, I really should be more consistent about reinforcing the observation that static typing and testing are dual to each other: static typing proves, by whatever logic the type system is isomorphic to, the absence of properties before crossing an important staging distinction, whereas testing modulo adequate coverage proves, by demonstration, the presence of properties. Which leads me nicely to...

Rys David McCusker: In essence I was dissuading them from the belief that if code has not yet crashed, that everything must be perfectly fine. I told them, no, there's lots of running states of marginal value which, though not outright incorrect, enable a later transition to failed invariants causing failure.

Exactly. Lately there seem to be some interesting tools to help identify these hidden invariants. One that I'm looking at hard lately is Daikon. I'm particularly interested in the intersection of Daikon and the Java PathFinder as useful tools for helping make existing Java code correct ex post facto. But I'm also interested in, e.g. Krakatoa, Why, and Coq to help us ensure that we know what we're doing before and during the development process. I call it Malachi Crunch development: I want to tackle defects with the best available tools from both directions, compile time and runtime.

Rys David McCusker: The problem with type checking systems often seems to be a focus on a clean binary distinction between okay (perfect) and not okay (broken). There are also permitted states in typed systems you also don't want to enter because they inhabit gray zones in the thinking of coders, where pre and post conditions are not very well thought out (and hence should be shunned). In fact, the word "shun" seems an interesting way to tackle gray zone logic.

I certainly agree that the overwhelming majority of type systems don't allow the programmer to craft anything near adequate specifications to suggest that all possible terms with the types in question are, in fact, handled by the implementation associated with those types. Coq'Art makes this point extremely strongly in the process of motivating dependent types, using the example of "a function that maps any natural number n, such that n>1, to the smallest prime divisor of n," which in a non-dependent type system gets badly underspecified as "nat->nat."

So popular type systems have "holes" with respect to narrowing the range of possible terms inhabiting the type, while testing has "holes" with respect to code coverage—ensuring that all possible terms are accounted for by the code. Obviously, this is an oversimplification, particularly insofar as it ignores temporal issues. Nevertheless, there's once again a suggestion of a dual relationship here that I think can be exploited more than it tends to be with either a focus on static typing or on dynamic typing.

It would be interesting, at least as an academic exercise, to attempt to develop a compiler for a language in which the types of terms were part of a Bayesian belief network. Type inference, then, would consist of belief updating. The representation should persist into runtime, and there should probably be some construct for setting a threshold of probability, with operations driving the belief of the type of the term(s) being operated on dropping below would result in an exception being thrown, be that at compile time or runtime. Hmmm. I guess you immediately run into the Buridan's Ass problem. Given:

x = "Hello!";
x = 42;

What is the type of x? It's .5 that it's a string and .5 that it's an integer. Of course, you can argue that if that's the whole program, it's 0 that it's a string and 0 that it's an integer, because x isn't observable. I guess that gets us into "subjective" vs. "objective" probability, etc. Ugh.

I dunno. I keep coming back to even poorly-thought-out pre- and post-conditions and invariants as having value, if only because today's theorem provers and model checkers will tell you, if nothing else, that you're being inconsistent. One thing that I love about the Alloy tutorial, for example, is that it uses a very simple hierarchical file system as its pedagogical vehicle. The immediate temptation is to say "A hierarchical file system; how hard can that be to specify correctly?" The answer turns out to be: surprisingly subtle, and even the resulting single-page Alloy model turns out to be extremely helpful.

Having said that, I'd still want tests for the actual implementation. :-)

By Paul Snively at Sun, 2006-08-27 07:12 | login or register to post comments

Mea Culpa

Ehud Lamm: for Paul: Playing Devil's advocate, are we? :-)

It is also true in those dark times, a wise man had to believe things that were in contradiction among themselves.
— Umberto Eco, "The Name of the Rose"

Ehud Lamm: Paul (again): One that I'm looking at hard lately is. So hard that you haven't mentioned it on the homepage? Shame on you ;-)

Well, in my defense (yes, I see the wink!), "lately" here really means "yesterday." There are several related (in my mind, anyway) bits, so let me try to craft a good story out of them. :-)

By Paul Snively at Sun, 2006-08-27 19:51 | login or register to post comments

wise man had to believe

wise man had to believe things that were in contradiction among themselves.

That's actually pretty good advice to anyone trying to work in our field.

(No, there's no wink at the end of that sentence ;-)

By Ehud Lamm at Sun, 2006-08-27 20:13 | login or register to post comments

marginal utility of detecting excluded middle

I hope this isn't one of those posts that make me seem strange. In the 70's I was a rare youngster interested in the law of the excluded middle (below, let LEM mean this, and let EM mean excluded middle) and criticisms thereof (cf intuitionist logic). Folks looked at me funny when I talked about such things back then, so I stopped. :-)

Ehud Lamm:
Defensive programming seems to be what you are talking about.

That's very close -- so close I don't want to quibble. (If I meant to quibble, I might say choice of terminology depends on what you're trying to reason about, and defense isn't a perfect fit in this case.) It's best to consider me a clever practitioner with a pragmatic interest in maximizing utility and minimizing costs.

I guess a lot of basic economic reasoning tends to inform my seldom-voiced assumptions about what's worth attention in daily work problems. Almost all the time I'm concerned with incremental increase in software value and the incremental decrease in costs. In economic jargon, it helps to know ideas like marginal cost, marginal utility,and opportunity cost.

(Here's a short crib note for folks who don't care about economic jargon: those are all basically calculus of value ideas concerned with incremental changes in cost and benefits.)

Okay, so in the real world (I mean in work you get in Silicon Valley -- I realize this isn't actually the real world :-), one finds marginal utility in software tends to decrease as you add code. The old code interferes with new code, and reduces the value of new stuff added. Lots of things cause the interference, and you can read all kinds of literature on this. But a cause I think about often, which is seldom noted, concerns the excluded middle (once again, LEM means law of the excluded middle).

The idea behind the law of the excluded middle is that a proposition P should be either true or false, exactly, and not have some fuzzy probability value between zero and one. For example, P might be "this code works perfectly" and not-P might be "this code is broken". Anyway, LEM says the excluded middle (EM) is zero sized, and that's what is meant by "excluded".

But in software, the EM (excluded middle) is not zero sized, and it tends to increase as you add more code. So the marginal EM of software tends to increase. (I know this sounds a bit weird, no?) It's caused by propositions that kinda-sorta apply, and code that kinda-sorta works with data when the data reaches allowed extremes.

That's a slightly more scientific way of saying "software gets chaotic", but now there's something to latch onto, since I'm offering causes of chaos that can be attacked by specific technique: go search and destroy EM if you can test for it. The EM is caused by lack of fit between real work data and usage patterns and the taxonomies (aka classification schemes) intended by the code.

Okay, now back to my gratuitous nod to economics, with the addition of one more term: diminshing returns. As you add defensive programming, the returns diminish, which means the marginal value of adding more defensive programming decreases.

But if you don't have any defensive programming at all, which is not all that uncommon for lots of software in the wild, then adding a little can have a large payoff to begin with. That is, the marginal cost of writing a little defensive code is small, but decrease in marginal cost from failed software is large. Coders usually call this sort of thing "low hanging fruit" (big bang for the buck), but only provided they conceive of the opportunity. Folks who believe the LEM is true all the time have trouble seeing the point.

By Rys McCusker at Mon, 2006-08-28 05:31 | login or register to post comments

I really should be more

I really should be more consistent about reinforcing the observation that static typing and testing are dual to each other

For a practical demonstration of this, look no further than the homepage, and the paper demonstrating the use of the JML runtime assertion compiler and associated testing to uncover design flaws that ESC/Java2 static checks could not (the converse, that ESC/Java2 can uncover design flaws that runtime checking would be all but useless to find, is of course clear). The beauty here, of course, is that you have one set of annotations that can significantly strengthen both the static and runtime checking simultaneously.

By Leland McInnes at Sun, 2006-08-27 22:18 | login or register to post comments

Final fantasy

So you start with a faulty (impossible, in fact) premise that certainly didn't come from me, and use it to reach a faulty conclusion ("giving ever more power to a godlike type-oracle does depower the programmer in the end and gives him close to nothing.") Frankly, it's bizarre on its face; how does a compile-time error, complete with line and character-range information, take power from you?

The "oracle" is a metaphor not a fact or reality. It represents a certain relationship between knowing a fact and not knowing about causation. The closest approximation in practice is the information that a black box tester offers to a programmer about an error occurred in his test.

As long as type systems are "trivial" ( as you say ) I don't see any practical problem with the compile-and-fix cycle. On the contrary it works fine in my own programming practice. But once you push the compile-time type checking fantasy to its end ( which is of course an "unreasonable" thought experiment for its exaggaration ) it becomes crucial to have running code despite the fact that it contains (un-)detected errors simply because you need to debug it and find the real error cause which may differ from the causation of the type-check failure which could be a mere epiphenomenon. Hence the oracle must be somehow decoupled from the compilation. You need compiled code anyway and have to rethink the current type-check-first strategy which also provides optimization hints to the compiler. But once you abandone the type-check-first strategy which decides about compilation the best thing you can get is compiled code with runtime type checks. Ironically the type-check-late strategy leads to a very different crop of languages and opens the static/dynamic language gap. I like to repeat myself too: here we are again.

By Kay Schluehr at Sun, 2006-08-27 07:19 | login or register to post comments

Objection, Your Honor

Kay Schluehr: But once you push the compile-time type checking fantasy to its end ( which is of course an "unreasonable" thought experiment for its exaggaration ) it becomes crucial to have running code despite the fact that it contains (un-)detected errors simply because you need to debug it and find the real error cause which may differ from the causation of the type-check failure which could be a mere epiphenomenon.

This is the part that I'm afraid I have to insist is erroneous: when you have a richer type system than is currently popular—that is, the farther you get from the origin of the Barendregt Cube—you don't have more spurious type errors; you have fewer of them, because the types get more and more specific. The weaker the type system, the more is left to the dynamic semantics of the language simply because it has to be, so the more you have to test. In the limit, you have the Calculus of Constructions or "better" (e.g. a Pure Type System). I put "better" in scare quotes because the Calculus of Constructions is already Turing-complete; Pure Type Systems etc. are more general but can't be any more expressively powerful.

So the bottom line seems to be that if your type system is the Calculus of Constructions or better, your types can be arbitrarily accurate, up to divergence (you still can't compute the uncomputable). If all of the functions in your system are total, there's nothing left to worry about. Of course, this rarely obtains in the real world: we tend to have a fair number of partial functions; parsing dynamic input into types the program deals with can be problematic; we haven't discussed terminating vs. non-terminating programs, and so on. Current type theory research revolves precisely around these issues and real progress is being made. I highly recommend Epigram: Practical Programming With Dependent Types and Lightweight Static Capabilities if you haven't already read them. They do an excellent job of motivating the use of rich type systems to make strong behavioral guarantees about programs. In particular, both address the issue of presenting abstract data types, all of whose supported functions are total.

Kay Schluehr: Hence the oracle must be somehow decoupled from the compilation. You need compiled code anyway and have to rethink the current type-check-first strategy which also provides optimization hints to the compiler. But once you abandone the type-check-first strategy which decides about compilation the best thing you can get is compiled code with runtime type checks.

The entirety of this is the flawed conclusion based on the flawed premise, unfortunately.

Kay Schluehr: Ironically the type-check-late strategy leads to a very different crop of languages and opens the static/dynamic language gap. I like to repeat myself too: here we are again.

And as long as people continue to create and promote dynamically-typed languages because they fail to accurately understand static typing, the longer we'll be here. Note that this doesn't mean that there aren't good reasons to create and promote dynamically-typed languages; there are. It's just that none of them have been successfully articulated here.

By Paul Snively at Sun, 2006-08-27 18:14 | login or register to post comments

The compiler is an interpreter is a compiler

Why can't the compiler let you debug your types?

By Daniel Yokomizo at Sat, 2006-08-26 13:40 | login or register to post comments

NO

Static type systems CAN NOT AND WILL NOT catch bogis answers. lets take this code for instance lets say you want to double a number and you type + instead of *. a static type system will not help you here

Haskell

foo :: (Num a) => a -> a
foo bar = bar+2

By atbusbook at Mon, 2006-12-04 17:50 | login or register to post comments

That's not a Type error

It is a programming error. Static type systems are capable of creating correctly type programs. They are not capable of verifying general correctness.

That said, the static vs. dynamic debate has a long history on LtU. Most LtU participants are tired of rehashing the same arguments, so you're probably not going to get much much of a response. Personally, I think if you are really enamoured of the advantages of dynamic programming, then Scheme and Lisp can't be beat. But then I also think that Scheme probably shares more in common with ML and Haskell (some of the most static languages in the world) than it does with languages like Python and Ruby. So the classification of static vs. dynamic is not as simple as one would think.

By Chris Rathman at Mon, 2006-12-04 18:10 | login or register to post comments

You need a fancier type for foo...

Here's one way to get the type error you desire...


{-# OPTIONS -fglasgow-exts #-}
{-# OPTIONS -fallow-undecidable-instances #-}

class    Foo a    where foo :: a -> Even
instance Foo Even where foo bar = bar *. (2::Even)
instance Foo Odd  where foo bar = bar *. (2::Even)

--type error...
--instance Foo Even where foo bar = bar +. (2::Even)
--instance Foo Odd  where foo bar = bar +. (2::Even)

newtype Even = E Integer deriving (Eq,Show)
newtype Odd  = O Integer deriving (Eq,Show)

infixl 5 +.
infixl 6 *.

class Add a b c | a b -> c where (+.) :: a -> b -> c
class Mul a b c | a b -> c where (*.) :: a -> b -> c

instance Add Even Even Even where (E a) +. (E b) = E (a+b)
instance Mul Even Even Even where (E a) *. (E b) = E (a*b)

instance Add Even Odd  Odd  where (E a) +. (O b) = O (a+b)
instance Mul Even Odd  Even where (E a) *. (O b) = E (a*b)

instance Add Odd  Even Odd  where (O a) +. (E b) = O (a+b)
instance Mul Odd  Even Even where (O a) *. (E b) = E (a*b)

instance Add Odd  Odd  Even where (O a) +. (O b) = E (a+b)
instance Mul Odd  Odd  Odd  where (O a) *. (O b) = O (a*b)

instance Num Even where
    fromInteger x | even x = E x
                  | otherwise = error $ (show x) ++ " isn't even"
instance Num Odd where
    fromInteger x | odd x = O x
                  | otherwise = error $ (show x) ++ " isn't odd"

By Greg Buchholz at Mon, 2006-12-04 18:39 | login or register to post comments

That's the point.

That's the point. Static type systems give a FALSE sence of security.

By atbusbook at Mon, 2006-12-04 18:42 | login or register to post comments

By that logic....

...any warning or error message the compiler gives leads to a false sense of security?

By Chris Rathman at Mon, 2006-12-04 19:03 | login or register to post comments

we are responsible

By a similar argument someone might claim that testing programs gives a false sense of security.

We are responsible for understanding what testing programs and type checking can tell us about our programs, and acting accordingly.

By Isaac Gouy at Mon, 2006-12-04 19:17 | login or register to post comments

Typeing finds Only type errors

Testing can find all errors includeing logical errors, but static typeing finds only type errors.

By atbusbook at Mon, 2006-12-04 19:55 | login or register to post comments

New type checking for LtU

Perhaps we need to add a new static type checking regime for LtU posts.

For example, anonymous posters with a short posting history consisting entirely of opening old controversial threads and making inflammatory statements without any substance fail to type check for me.

I'd recommend a similar compiler setting for all serious LtUers.

By Marc Hamann at Mon, 2006-12-04 20:05 | login or register to post comments

Each of us should, of

Each of us should, of course, use his best judgement. Personally, I tend to follow Marc's approach unless I find a really good reason to respond.

By Ehud Lamm at Mon, 2006-12-04 20:17 | login or register to post comments

Yielding to temptation

Each of us should, of course, use his best judgement.

Though, it would seem based on the volume of responses generated that we are unlikely to let judgement get in the way of a tasty S vs. D thread. ;-)

By Marc Hamann at Mon, 2006-12-04 21:54 | login or register to post comments

reason-forcing conversations

Sometimes raking over old coals sparks a new understanding (mostly it doesn't).

By Isaac Gouy at Tue, 2006-12-05 17:34 | login or register to post comments

Apparenty testing incapable of catching spelling errors?

So how do you go about proving that testing finds all logical errors? Any formal mechanism of proof? Any objective empirical data to back up the claim?

Of course, it's quite feasible to use testing in all languages. Indeed, it should be encouraged.

By Chris Rathman at Mon, 2006-12-04 20:07 | login or register to post comments

evidence based approaches

Your second sentence got me thinking about my evidenced-based approach to coding practice. But as I composed a response, I noticed the topic of security dove-tailed right into the same human factors aspect.

Isaac Gouy: By a similar argument someone might claim that testing programs gives a false sense of security. [¶ (new para)] We are responsible for understanding what testing programs and type checking can tell us about our programs, and acting accordingly.

These days I work in a shop espousing a lightweight version of the "agile programming" creed, and we're pro testing and wish we had time to retrofit more tests. When I'm asked what I think of the "agile" process, I've started saying the same thing, which is: all the agile techniques aim to stop folks from shielding themselves from negative feedback. In other words, the aim is to 1) escape a false sense of security from evading input/feedback, and 2) to deal with negative data turning up from testing and criticism.

So agile programming is just an evidence-based process -- it aims to get more evidence sooner and deal with it. This approach helps defeat a natural human tendency to fantasy that everything's better than it is (if you can avoid evidence to the contrary). Okay, I'll get back to directly addressing words in the post above.

Both static typing and testing generate evidence on whether our assumptions (beliefs, expectations, hopes) about code are true. And we need to act accordingly after we get evidence. More of both (typing and testing) helps to fill the void of evidence encouraging false security. Neither seems adequate. I'll take testing over typing when I can get it, but typing saves me more time hacking old code.

I suspect some advocates of strong static typing harbor a false hope that typing alone can make everything fine, as if it generates enough evidence to permit a full sense of security without testing. Even a fabulous type system would require a lot of testing just to audit the type system, and with similar diminishing returns as everything else. (The more you test, the fewer decimal places of confidence you add because you tend to re-address old ground.)

Sorry if this post seemed to wander a bit.

By Rys McCusker at Mon, 2006-12-04 20:16 | login or register to post comments

pot meets kettle

I suspect some advocates of testing with static typing harbor a false hope that testing alone can make everything fine, as if it generates enough evidence to permit a full sense of security without static typing. ;)

Static typing and testing are complementary tools because they operate on different phases and have different goals. Ignoring testing is bad because almost every program has a runtime behavior that can't be verified statically, but ignoring static typing is worse because all almost every program has a static structure that can be verified but can't be tested.

If I have to choose between an inexpressive type system (e.g. Java before 1.5) and testing I would take testing every day. But nowadays the choice is expressive type system and testing or just testing. As typing can eliminate entire classes of errors. e.g. I can make a library that is provably safe against XSS or SQL injection by having safe types and providing formal proofs from Coq that the morphisms from user input to safe types do properly sanitize the data. If I have a value of type SafeString I can be sure that it can be safely embedded in SQL or HTML without further testing. It's impossible to do the same with testing. Ditto for side-effects, I still trying to see a testing based approach to verify that some code doesn't have side-effects if isn't supposed to.

Also you don't need to test things that the type-system checks for you so IME testing can be more comprehensive under expressive type systems. Testing doesn't provide such short-cuts to the type system, there's no way that a test suite can facilitate the work of a type checker.

I'm a big fan of testing and a proponent of TDD. I keep a large test suite and run it frequently, usually after finishing any new piece of code (i.e. up to 10 minutes since last run), but I don't think the state of the art testing can be more efficient in bug prevention/detection than the current state of the art type systems.

Even a fabulous type system would require a lot of testing just to audit the type system, and with similar diminishing returns as everything else.

Why would we want to test the type system instead of statically analyzing it? Type systems are smaller than most programs and much more amenable to be formally proved. Once the type system is verified to be correct (e.g. sound, doesn't stuck) it don't have to be tested.

By Daniel Yokomizo at Mon, 2006-12-04 20:58 | login or register to post comments

need to use both

I meant to say we need testing and typing both, in agreement with Isaac Gouy, though he phrased it more as a need to understand what each has to offer. Perhaps I emphasized a pessimistic point of view that even using both, you can't get enough evidence to be really done.

Sorry about my casual suspicion of optimism in typing advocates. Both the pot and kettle are black.

I do like statically analyzing systems. I just don't believe a static analysis completely without testing to agree with static analysis. :-) (The essence of testing is just comparing alternate ways of doing the same thing, to seek disagreement in evidence.)

By Rys McCusker at Mon, 2006-12-04 21:25 | login or register to post comments

100% agreement

In my day job I say that I need to write unit tests because I know that I can't program without making mistakes. Sometimes even the types are wrong or too permissive.

Perhaps I emphasized a pessimistic point of view that even using both, you can't get enough evidence to be really done.

But we can reduce the risks of it being wrong. Sometimes enough evidence is more than enough ;) Reading about the Erlang model for scalability (i.e. assume that processes can have software errors and design a system that can deal with it) was particularly enlightening. Instead of trying to avoid errors and be vulnerable to any piece of code that you believe is correct because it was throughly tested, statically analyzed, etc., you just try to make sure that whatever mistakes it may do are detectable and transient/recoverable.

I just don't believe a static analysis completely without testing to agree with static analysis. :-) (The essence of testing is just comparing alternate ways of doing the same thing, to seek disagreement in evidence.)

Why can't you just trust the compiler? The compiler is your friend and it will never harm you... ;)

The biggest advantage of testing is this form of positive redundancy, which reminds me of physics where you must experiment to corroborate or refute a theory. Because of the nature of experiments a scientific theory can never be proven but it must be falsifiable (if we agree with Popper). So in a sense a good computer program should be amenable to testing or else it can't be falsifiable.

OTOH programs are mathematical artifacts that can be proved, but the effort required to prove it may be (and usually is today) much larger than the effort required to write it. In the end of the day I always try to remember that no matter what we do the software may not do what the user wants, even if it's provably correct, because the user intent is too hard to nail down. Software development really is a strange and fascinating beast ;)

By Daniel Yokomizo at Tue, 2006-12-05 16:46 | login or register to post comments

V & V

... no matter what we do the software may not do what the user wants ...
The old distinction - "Verification is checking that we have built the system right. Validation is checking that we have built the right system." Martyn Ould

By Isaac Gouy at Tue, 2006-12-05 17:09 | login or register to post comments

You Should Care

Let's take an exemple... You have all these classes in your code with a method foo(). For one of these classes, you rename that method to bar(). You have calls to foo() all over your application, but only some of them needs to be replaced by bar(). How many time would that take you to check everything manually ? Now how many time with compiler errors "this object as no method foo" ?

By Nicolas Cannasse at Sat, 2006-08-26 09:44 | login or register to post comments

Test suite

Assuming a test suite with good coverage you can do this quite easily by replacing the relevant foo() implementation with a stub that throws a suitable exception. Assuming exceptions in your language carry decent traceback information you should be able to identify the call sites about as easily as you would with a compiler.

Of course everything in my argument hinges on that initial assumption about good code coverage.

By Per Vognsen at Sat, 2006-08-26 10:20 | login or register to post comments

Testing is better than Static typeing.

Testing is better than Static typeing because it can catch all errors rather than just type errors.

By atbusbook at Mon, 2006-12-04 19:48 | login or register to post comments

A type error in a typed

A type error in a typed language is not the same as a "type" error in an untyped language. There are type systems that can prove things about deadlock freedom, resource usage, referential transparency, termination and more besides. Testing cannot. Most people are, as you can tell, sick and tired of pointing things like this out. You are repeating the same incorrect arguments many people have already made. I suggest you learn more about programming language theory and for the foreseeable future read about - rather than post about - subjects you are not familiar with. A good place to start is the "Why Type Systems Are Interesting" thread.

By rebooted at Mon, 2006-12-04 21:22 | login or register to post comments

What a bizarre statement!

What advantage is there to type-check at compile time rather than runtime?

There's nothing more annoying than running a 2 hour computation only to find at the end that you accidentally initialised an array with an integer '0' because you left off the '.' meaning that from that point onwards every single value stored in it was cast to an integer meaning that the final result was just an array of zeros. The earlier you catch your errors the better. If you can catch your error after a few seconds of compilation rather than after a few minutes (or hours) of execution, then you can fix minor bugs 60 (or 3600) times as fast.

By sigfpe at Mon, 2006-08-28 02:12 | login or register to post comments

Generally then a dynamic

Generally then a dynamic typed language runs in to a type error it throws an exception (you could say that most static languages does this to on for example divide by zero errors). I don't see why this exception would be any diffrent from other exceptions. (and whatever the exception is checked or not seems quite orthogonal.)

By Felicia Li Svilling at Fri, 2006-08-25 07:39 | login or register to post comments

Yes, in the context of depentent type systems.

If null and non-null are considered two distinct types, then checking for null at run-time is similar to run-time type-checking.

By Achilleas Margaritis at Fri, 2006-08-25 09:17 | login or register to post comments

Two different issues

IMO there are two different issues to Ruby's runtime variable-checking nature:
1. Accessing a variable that hasn't been assigned to before throws an exception
This can be a VERY GOOD feature - I don't know how many hours I've spent in the debugger wishing for such a feature in C/C++. It isn't that bad in languages like Java or C#, where variables are automatically initialized, but still, chances are they're initialized to something you didn't expect when you were writing the code that's using the variable (i.e. null).
2. There is no mandatory static-checking-phase for Ruby code
I'm used to comfortable IDEs showing me some programming errors, so I'd agree this is an annoyance. However, I don't think this is a language issue: If you wanted, you could build a tool that checks if if local variables are assigned before they're used, if all member variables that are used are set in the initializer. Tools like that have been written for Python (PyChecker, PyLint), and I don't see why they shouldn't be possible for Ruby as well.

So, from my experience, I'd say Ruby's behaviour is preferable to C/C++'s, and comparable to Java's. The only better (i.e. more anal-retentive) option I see is to require declaration _and_ initialization, like ML and Haskell do.

By Niki Estner at Fri, 2006-08-25 12:00 | login or register to post comments

a tangent

Accessing a variable that hasn't been assigned to before throws an exception . . .

What if accessing a variable that hasn't received an assignment yet were to hang until an assignment to it should occur?

By Jack Waugh at Fri, 2006-08-25 12:43 | login or register to post comments

Hanging variables

What if accessing a variable that hasn't received an assignment yet were to hang until an assignment to it should occur?

Wouldn't that just open the door for a whole new world of deadlocks and thread-sync problems? I mean accessing an uninitialized variable is a programming error, like dividing by zero. Nobody would want a division by zero to hang until the divisor changed to something different.

By Niki Estner at Fri, 2006-08-25 13:31 | login or register to post comments

Once vs. Many Times

I'll save Peter Van Roy the trouble of replying. :-)

You'd be right, if you could assign to the variable over and over again. If, however, the variable in question is a logic variable—that is, can only be assigned to once—then you avoid the deadlock and race possibilities, and instead you have one part of a very nice approach to concurrency. The rest of the pieces are also nicely implemented in Oz, an extraordinarily powerful multi-paradigm language that also serves as the implementation engine employed in the CTM, which I can't recommend highly enough.

By Paul Snively at Fri, 2006-08-25 14:35 | login or register to post comments

Yes, but it realy is an

Yes, but it realy is an irritating error then it shows up in Oz. Especially then you write a single threaded application.

By Felicia Li Svilling at Fri, 2006-08-25 16:51 | login or register to post comments

Oh, I Dunno

To me, having accessing an uninitialized variable diverge is quite a lot more defensible than the popular alternatives: having the variable take on a random value (C, C++...) or having the variable take on an arbitrary default value (Java, I believe C#...). This is especially compelling, IMHO, given the open world assumption. You literally can't know when some other entity (thread) that might not even have been dynamically loaded yet is going to give that variable a value.

By Paul Snively at Fri, 2006-08-25 17:56 | login or register to post comments

Kinds of variables

There are too many useful variable behaviors to stick to a single one.

For example my language Kogut supports out of the box:

let — Constants. May not be assigned to.
var — Plain mutable variables, initialized to the specific value (which is Null by default).
dynamic — Dynamically scoped variables. Can be locally rebound to a different mutable binding. The top binding is optional; if absent, accessing the variable outside any scope where it's given a binding is an error.
lazy — Lazy variables. The expression providing the value is specified when the variable is defined, but it's executed when the variable is read for the first time.
forward — Can be initialized once. Reading before initialization is an error.

It's easy to implement custom kinds of variables, like the mentioned here variables which synchronize threads, or combinations of the above features.

By Qrczak at Fri, 2006-08-25 20:01 | login or register to post comments

Global variables are not under your control???

If you are going to depend on global variables for your code then you had better make sure that it is initialise and not depend on someone else doing it for you. If it is local then not giving it a initial value and depending on the system to do it for you is courting danger. For support purposes, unitialised variables are a nightmare to handle and are a blight on the code.

Unless the language gives you the facility to determine if a variable is unitialised, it is your responsibility to make sure it is.

ICON and UnIcon both allow you to check if a variable has or has not been initialised and in the case of an uninitialised variable allows you to give it a default value. If it has been assigned to then the initialisation assignment is ignored (fails). A useful tool for programming if you are depending on others to set the variable for you.

By Bruce Rennie at Sat, 2006-08-26 07:46 | login or register to post comments

Definition of a variable ...

At the time you define a variable, it seems to me that it is the appropriate time to give it a starting value. I realize there are many languages that don't allow this, but I believe that this shows that the design of those languages is missing something.

My days are involved with languages that define a variable first before allowing it to be given a value and it really is a pain in the neck. If you are going to use a variable in your programming solution, one would expect that you know what you want its first value to be, otherwise what's the point of defining the variable in the first place.

The associated idea is that definition of the variable should take place near to its first use.

By Bruce Rennie at Sat, 2006-08-26 07:16 | login or register to post comments

Dynamic typing; checked exceptions

While in Ruby misspelled method names are indeed detected at the time such a method is called, this is not a necessary property of dynamic typing.

With CLOS-style generic functions scoping is resolved statically and using an undefined name can be detected at compile time. Even some â€œdynamicâ€ errors like bad number of arguments can be often detected statically.

— * —

I agree that checked exceptions in Java are a mistake. See here, here and here.

By Qrczak at Fri, 2006-08-25 12:07 | login or register to post comments

The last article on checked exceptions...

...discussed on LtU

By Chris Rathman at Fri, 2006-08-25 14:29 | login or register to post comments

As an aside, now we have

As an aside, now we have HList in Haskell (though another suitable language could take a shot at it) it might be interesting to build a checked exception monad with a runThis-style function that takes handlers for any exceptions still uncaught at that top level? (and possibly variations that handle any uncaught exceptions etc etc) It seems to me that this is likely to be more useful than the Java-style implementation.

By Philippa Cowderoy at Fri, 2006-08-25 14:35 | login or register to post comments

Question

Can a planning system IN an environment be statically type checked? Of course not! The runtime type check is what the planning process is all about. In other words the failure or success of partial plans is how the correct plan is found. If the environment is equal to the program then the system is analytic and can be static type checked. Otherwise there is an inherent "situation" (ie a nondeterminate state) and a partely analytic system that uses the situation.

By Hank Thediek at Sat, 2006-08-26 12:17 | login or register to post comments

Right!

To recapitulate: the boundary of static typing lies at the dynamic input to the software. It's perhaps worth noting that this is also the boundary of partial evaluation. This isn't as damaging as it might seem at first glance, however; consider, for example, that it's still possible to have a type-safe C-style printf() function. Amazingly, there's even type-safe unmarshaling for O'Caml.

But this leaves genuinely dynamic input, as opposed to reconstituting prior output, as the point at which the dynamic semantics of your language matter, a point that both you and Curtis W have made well.

By Paul Snively at Sat, 2006-08-26 13:17 | login or register to post comments

dynamic input

So what tools does the language need to deal with dynamic input? l seem to see a connection with defensive programming here, is that correct?

It was mentionned earlier that dynamic input should be typed properly but l am not sure what that means in practice. Is that some kind of filtering on the values received?

Finally, does that relate to non-determinism (like in concurrency context) at all?

By Denis Bredelet -jido at Wed, 2006-12-06 11:44 | login or register to post comments

Dynamic Input, Informally

Let me take a stab at explaining what I mean by "dynamic input," in a strictly informal way. In many—I'm inclined to say "most"—software, there comes a time at which the program needs to interact with the user, receiving input, generally natively as a sequence of ASCII or, if we're lucky, UNICODE characters—a string. But the expected data might not actually be of type "string;" perhaps we've asked them to enter their birth year and we're expecting a positive integer. In C or C++, we might employ a member of the scanf() family of functions to parse the string into an int. Then we'd have a value of the right type.

Sometimes, though, the type of the input isn't necessarily a primitive type, nor is the type necessarily predictable at compile time. In such cases, the parsing task becomes more complex, and I would argue that, yes, here is the connection to defensive programming: when the parser encounters unexpected input, how does your program deal with it? Given a set of type definitions, can parsers/unparsers for them be generated automatically? Is it reasonable to expect to be able to handle input that way when it's coming from a human being as opposed to another copy of your program somewhere on the network?

In any case, your intution that there's "some kind of filtering on the values received" is spot on: the basic idea is that there are a variety of ways in which values of any given type can be encoded as strings, so conversely, there are a variety of ways in which a value of type "string" can be "filtered" into a value of a much more specific type, at which point the rest of the program benefits from the type-safety constraints imposed upon values of that type.

By Paul Snively at Wed, 2006-12-06 16:15 | login or register to post comments

Not only user input

Don't forget that the dynamic input could come from a file, an Internet connection etc. Special care is needed if you receive executable input (like SOAP or RMI), for which E has an interesting approach.

I don't see a connection with non-deterministic patterns in the end.

By Denis Bredelet -jido at Wed, 2006-12-06 21:51 | login or register to post comments

there's a third answer

besides "static" and "dynamic", there's also "type inference". Yes, unchecked exceptions are comparable to dynamic type checking (which is a silly expression, only values exist at run time). The problem with checked declarations in Java and C++ is, that they are monomorphic and not inferred. You should simply tell your co-worker, that "Java Is Not The Pinnacle Of Static Typing".

(Unfortunately, even Haskell has unchecked exceptions. Something should be done about this, but I'm afraid it's quite a bit of work.)

By Udo Stenzel at Sun, 2006-08-27 10:14 | login or register to post comments

Don't Feed Bad...

... so does O'Caml. Interestingly, however, there's ocamlexc, "Type-Based Analysis of Uncaught Exceptions, which helps a lot.

By Paul Snively at Sun, 2006-08-27 18:18 | login or register to post comments

Terminology

For the record: static type checking and dynamic type checking refer to the timing and nature of, you guessed it, type checking. Explicit versus inferred types are orthogonal: static type systems of both varieties exist. Haskell, for example, is is statically type checked, and has type inference.

A third dimension is nominal versus structural type equality.

By Ehud Lamm at Sun, 2006-08-27 18:44 | login or register to post comments

With nominal versus

With nominal versus structural I think it's often recognised that the option of both is valuable though? If only because nominal typing allows you to assign intent from the concepts in the programmer's mind in a way that structural typing alone can't attempt.

By Philippa Cowderoy at Mon, 2006-08-28 20:21 | login or register to post comments

User login

Navigation

unchecked exceptions are like runtime type checking?

Comment viewing options

Browse archives

Active forum topics

New forum topics

Recent comments