Static Typing Where Possible, Dynamic Typing When Needed

Instead of hammering on the differences between dynamically and statically typed languages, we should strive for a peaceful integration of static and dynamic aspects in the same language. Static typing where possible, dynamic typing when needed!

Static Typing Where Possible, Dynamic Typing When Needed (PDF)

OK...

As a round-up of desirable typing behaviours, this is pretty good.

One point I think they miss is that the "lazy streams" behaviour of Unix pipes is undermined by XML, because you have to have received all of an XML document before you can determine whether it is well-formed, validate it, etc.

Obviously there are typeful ways of doing everything they say you need dynamic typing to do, including handling "semi-structured data", so presumably the question is whether there are cases where a (certain) typeful approach is too cumbersome to be practical.

As much as I admire the HList approach to extensible records in Haskell, for instance, it does rely on some fairly heavy type-hackery. Whenever hackery starts to happen, it's time to start considering alternatives.

Why wouldn't it be possible

Why wouldn't it be possible to begin consuming XML before receiving all of it? I can imagine that you might need to buffer some elements (e.g. insist upon a whole lexical token so you don't get only part of a tag), but XML syntax is so simple that I would expect you to be able to parse it without backtracking--and if you can do that you can tell me whether it is well-formed/valid SO FAR. If it turns out further down the road that the document is invalid, it's just like any other lazy stream: raise an error in the consumer when they ask for the next chunk.

-Max

sax

Why wouldn't it be possible to begin consuming XML before receiving all of it? I can imagine that you might need to buffer some elements (e.g. insist upon a whole lexical token so you don't get only part of a tag), but XML syntax is so simple that I would expect you to be able to parse it without backtracking--and if you can do that you can tell me whether it is well-formed/valid SO FAR. If it turns out further down the road that the document is invalid, it's just like any other lazy stream: raise an error in the consumer when they ask for the next chunk.

This is exactly what SAX does.
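
To make the streaming idea concrete, here is a minimal sketch (in Haskell, not an actual SAX parser; the Event and checkStream names are made up for illustration). Well-formedness is checked lazily over a stream of tag events, and an error surfaces only when the consumer demands the offending chunk:

data Event = Open String | Close String | Text String deriving Show

-- Yield each event as soon as it is consistent with the tags seen so far;
-- problems are reported only when the consumer forces that part of the stream.
checkStream :: [Event] -> [Event]
checkStream = go []
  where
    go stack []
      | null stack = []
      | otherwise  = error ("unclosed tags: " ++ unwords stack)
    go stack (e@(Open t) : rest) = e : go (t : stack) rest
    go stack (e@(Text _) : rest) = e : go stack rest
    go (t : stack) (e@(Close t') : rest)
      | t == t'   = e : go stack rest
      | otherwise = error ("mismatched close tag: " ++ t')
    go [] (Close t : _) = error ("unexpected close tag: " ++ t)

-- take 2 (checkStream [Open "a", Text "hi", Close "b", Close "a"]) yields the
-- first two events; the mismatch only raises an error if that chunk is demanded.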

Like it too

I like it too: less math, more thinking. One thing I think is underappreciated is the use of eval-like constructs for building specialized objects at run time.

For example, you can easily build a parser at run time with parser combinators [in the standard manner, I should add], but you then miss the many optimizations of a compiler. Generating source text as an intermediate step and passing that to a compiler is a perfectly valid and easy way to build fast specialized parsers at run time.
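
For anyone who hasn't seen the combinator style being referred to, here is a minimal sketch (Haskell, textbook-style; nothing here is specific to the poster's system). A parser is a function from input to possible results, and two combinators, sequencing and choice, build bigger parsers from smaller ones at run time:

newtype Parser a = Parser { runParser :: String -> [(a, String)] }

-- Match a single character satisfying a predicate.
satisfy :: (Char -> Bool) -> Parser Char
satisfy p = Parser (\s -> case s of
  (c:cs) | p c -> [(c, cs)]
  _            -> [])

-- Sequencing and choice: the combinators everything else is built from.
andThen :: Parser a -> (a -> Parser b) -> Parser b
andThen (Parser p) f =
  Parser (\s -> concat [ runParser (f a) rest | (a, rest) <- p s ])

orElse :: Parser a -> Parser a -> Parser a
orElse (Parser p) (Parser q) = Parser (\s -> p s ++ q s)

-- A parser assembled at run time; the alternative the poster mentions is to
-- emit source text for the same grammar and hand it to a compiler instead.
digit :: Parser Int
digit = satisfy (`elem` ['0'..'9']) `andThen` \c ->
        Parser (\s -> [(fromEnum c - fromEnum '0', s)])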

I'm not convinced

This advocates C#-style static typing with some minor modifications, which is not surprising given where the author comes from. "Static OO" people will like it, but neither "static functional" nor "dynamic" people will.

Underestimates what typing can do

The authors basically throw away all recent advances in type systems by saying "boy, these sure are complicated. That makes them bad." And a lot of their arguments are based on the insanity of the C# type system, that is, that all reference types are nullable by default. Also, "static typing is static. How much can it do, really?"
However, a lot of the stuff they propose would be a large improvement on C#. Granted, that isn't saying much, but still.

From the Sublime to the Absurd

Please note that this paper is still very much work in progress and as such the presentation is unpolished and possibly incoherent. Obviously many citations to related and relevant work are missing. We did however do our best to make it provocative.

...

Static typing fanatics try to make us believe that “well-typed programs cannot go wrong”. While this certainly sounds impressive, it is a rather vacuous statement. Static type checking is a compile-time abstraction of the runtime behavior of your program, and hence it is necessarily only partially sound and incomplete. This means that programs can still go wrong because of properties that are not tracked by the type-checker, and that there are programs that while they cannot go wrong cannot be type-checked.

Are we sure that Meijer had anything to do with this paper? Meijer knows exactly what "well-typed programs don't 'go wrong'" means. He knows exactly what "well-typed" means, what "go wrong" means, and why it's in "scare quotes." All this paragraph serves to do, for anyone who knows even a modicum of type theory, is undermine the notion that the authors do.

Folks who wish to "end the cold war" should probably also avoid using terms such as "fanatics" to describe one side or the other of the conflict—especially in the process of offering evidence that they don't understand what that side is saying.

The impulse for making static typing less partial and more complete causes type systems to become overly complicated and exotic as witnessed by concepts such as “phantom types” [11] and “wobbly types” [10]. This is like trying to run a marathon with a ball and chain tied to your leg and triumphantly shouting that you nearly made it even though you bailed out after the first mile.

Apparently the authors can't tell the difference between "provocative" and "asinine." Meijer knows better than this nonsense.

Unfortunately, I have yet to see the argument for latent typing or even this "third way" that doesn't radically underestimate or simply misunderstand static typing or complain that rich type systems are hard to understand (ironically while claiming that using such a rich type system "in their heads" while developing a latently-typed program that would be "hard to type" in something like Haskell is somehow mysteriously "easy to understand").

Well..

Are we sure that Meijer had anything to do with this paper?

Maybe someone should ask him...?

One good troll deserves another

Unfortunately, I have yet to see the argument for latent typing or even this "third way" that doesn't radically underestimate or simply misunderstand static typing or complain that rich type systems are hard to understand (ironically while claiming that using such a rich type system "in their heads" while developing a latently-typed program that would be "hard to type" in something like Haskell is somehow mysteriously "easy to understand").

Perhaps these threads will help - I suspect you're remembering them selectively.

You Might Think So...

...but I'm not, and in fact I've recently found new information about one of the points of disagreement from those threads: that dynamic typing is a "special case" of static typing. From Oleg Kiselyov's page on an ultimate static type system:
One of the paradoxical issues in the recurring debate about types is Robert Harper's thesis that dynamic typing is a particular case of static typing. This article has attempted to show the other extreme -- the ultimately precise static typing. Curiously, at this extreme static and dynamic typing converge again. This article presents an example of a type system which lets a compiler detect a latent division by zero in an expression, and flag it as a type error.

On one end are dynamically typed languages. Their (byte) compiler is simple and fast; it cheerfully accepts almost every garbage we can throw at it. The run-time of these languages is inherently slower as every operation must first run a battery of tests on its operands: that the referenced variables are indeed defined, the operands of a div operation are indeed numbers, and furthermore, the divisor is not zero. If a branch of code was not executed in a particular run, the errors in this branch, however grave or obvious, remain unreported.

The ultimately statically-typed "language" outlined above achieves the other extreme. The run-time is now fast and trivial as it only prints the results that have already been determined by the compiler. Compilation on the other hand is highly involved. For example, to make sure that no divisor in all division operations is zero, the compiler basically has to evaluate the divisor expressions -- in effect compute them at compile time. In general, such an evaluation may be involved, and may even fail to terminate.

Thus the extreme static typing looks identically to dynamic typing, only with compile- and run-time phases reversed. The optimal approach seems to be to do some computation at compile time (e.g., making sure all referenced variables are declared, or that operands of any numerical operation are indeed numbers) and leave more tenuous checks and the rest of the computation to run-time. As we know, the middle ground is the most controversial position.

I think this rather nicely sums things up. Regardless, although I still see value in mixing static and dynamic typing (as Oleg does), my point remains: all dynamic typing arguments I've seen (yes, even the ones on those old threads) underestimate or misrepresent the abilities of modern type systems. In particular, the quoted parts of the paper in question do, to my astonishment given Meijer's co-authorship. And once again, if the goal is to find a middle way, they first need to start from an accurate understanding of the statically-typed side. My take from the quoted paragraphs is that they have singularly failed to do so.

Contentless

Granted, I've only briefly skimmed the paper, but the quote you have there as well as the rest of the paper seem pretty... contentless. I'll sum up the quote: "Static types are static, Dynamic types are dynamic".

Now, I'm just a budding type theorist (autodidactic), but isn't one of the foundational ideas behind type theory that it's the behavior and not the representation that is important regarding types? This paper uses C++ types to represent natural numbers. How is this fundamentally different from using bits? Really, all this paper is, is an implementation of the natural numbers in the C++ template language. It happens to catch division by 0, but that's nothing profound, really.

The problem here is that the author is equating compile time with static and run-time with dynamic. The problem with that is that compilers have their own run-time, which is dynamic rather than static. Templates allow the programmer to tap into the compiler's run-time to allow a programmer to insert dynamic results into a static environment. In other words, the C++ template system claims yet another victim!

If the extent of static typing is essentially thunks (putting your program result in another program), I'm looking for something else.

You Can Say "In the Limit..."

...static and dynamic typing are equivalent. Remember (or read the threads Anton pointed to, if you haven't already) that one of the "holy grails" discussed before is a static-dynamic "slider" of some kind. But that's based on the assumption that static and dynamic typing really are isomorphic, which wasn't demonstrated in the other thread; in fact, there was some question regarding Frank's assertion that (and this is not a direct quote) static type systems can encode any dynamic type system in terms of Univ. Frank backed down from the claim for reasons that I honestly can't recall and don't believe I'd be able to understand from re-reading the threads. Lacking Frank's understanding of the relevant Category Theory, I also backed down from the claim. Oleg Kiselyov's work, of which the C++ template metaprogramming is only one example, supports what he characterizes as "Robert Harper's thesis that dynamic typing is a particular case of static typing" basically by providing an existence proof that static and dynamic typing do converge at both the static and the dynamic limits. That is, they genuinely are isomorphic. I still think that this follows pretty directly from the Futamura Projections, but it's nice to have a demonstration.

But none of this really matters much to my practical objections to the arguments in favor of dynamic typing, which basically lie along two lines:

  1. Dynamic typing has only probabilistic answers to statically typed systems such as TyPiCal or Acute: where TyPiCal prevents your code from compiling if you have deadlock or race conditions, for example, any testing short of 100% code coverage, at which point you're doing a form of type inferencing at runtime, gives you only a probable assurance that you lack deadlock or race conditions. This is a difference of kind rather than degree.
  2. I still haven't seen a defense of dynamic typing that didn't either fail to understand, or at least underestimate, what modern type systems can do. Admittedly, most often that's been due to ignorance, but unfortunately, I'm satisfied that at least in some cases it's been due to intellectual dishonesty on the part of the dynamic typing supporter. Let me hasten to add that I'm not referring to Anton here!

And you know what? You really won't find me posting much on the subject until we get something like this Microsoft paper, which falls firmly into the fails-to-understand-or-underestimates category. Then I just get angry all over again, and we end up wasting a lot more pixels on the subject. But I can't help it: I have a very big problem with big-name shops or authors promulgating at best misinformation, and at worst, lies.

Watch where you wire that slider!

Oleg Kiselyov's work, of which the C++ template metaprogramming is only one example, supports what he characterizes as "Robert Harper's thesis that dynamic typing is a particular case of static typing" basically by providing an existence proof that static and dynamic typing do converge at both the static and the dynamic limits.

I think this reads too much into what Oleg's example actually demonstrates. Of course, we know that computations can be shifted from compile time to runtime and vice versa. That's not new information. Oleg uses a type system to perform a computation at compile time, taking advantage of one of the basic lessons of lambda calculus: that any system that gives you the ability to create and apply a certain kind of parameterized abstraction is Turing-complete. Oleg uses type identifiers and type specialization as abstraction and application. It's clever, and fun, but I don't see that it's particularly relevant to the discussion of a static/dynamic slider in type systems — using a type system as the primary computational mechanism in a program is tangential at best.

Afaik, Robert Harper's "thesis that dynamic typing is a particular case of static typing" is merely a way to embed a dynamically-typed program in a statically-typed language, using a data type for tagged values, e.g. a Univ type. I don't know if Harper has any more extensive thesis in this area - if so, I'd appreciate any references.
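
For concreteness, here is a minimal sketch of such a Univ embedding (Haskell; the constructor and function names are invented for illustration). Every operation on embedded values performs the tag checks that a dynamic runtime would perform:

data Univ
  = UNum Double
  | UBool Bool
  | UStr String
  | UNil
  | UPair Univ Univ
  | UFun (Univ -> Univ)

-- "car" in the embedded dynamic language: a tag check plus a projection.
car :: Univ -> Univ
car (UPair a _) = a
car _           = error "car: not a pair"

-- Application checks its operand's tag too, just as a dynamic runtime would.
apply :: Univ -> Univ -> Univ
apply (UFun f) x = f x
apply _        _ = error "apply: not a function"

-- (car 4), which comes up later in the thread, is well-typed in the embedding;
-- it simply fails at run time, exactly as it does in Scheme.
bad :: Univ
bad = car (UNum 4)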

We discussed this approach in the previous type system threads. The claim is not as strong as it might sound at first. In particular, as Kevin Millikin pointed out, you can also do the reverse, encoding static type information in a dynamically-typed language. So by the same token, we can say that "static typing is a particular case of dynamic typing". It was this observation that led Frank to back off from claiming that untyped languages were a proper subset of typed languages, in this post.

When it comes to a type slider, the Univ embedding approach would presumably mainly be useful if a statically-typed language provided better syntactic support for dealing with Univ values, which would bring its expressiveness for those kinds of operations closer to that of dynamically-typed languages. But I doubt this is all that interesting to static typing fans, and it doesn't address the much more interesting point -- namely, the latent type information that goes unexploited in dynamically typed programs, even when those programs are encoded in a statically-typed language via a Univ embedding. It's that latent type information that a slider really needs to be able to deal with, and the Univ embedding doesn't achieve that.

"The assumption that static and dynamic typing really are isomorphic" has already been shown to the degree that it needs to be for the feasibility of a static-dynamic slider, I think. The real demonstrations of this are not found in Harper's thesis or in Oleg's example, but in the soft type inferencers, which can assign type schemes to programs in (at least some) dynamically-typed languages. These inferencers provide a concrete interpretation of latent types.

I still think that [proof that static and dynamic typing do converge at both the static and the dynamic limits] follows pretty directly from the Futamura Projections, but it's nice to have a demonstration.

The Futamura Projections are more relevant to this issue, but on their own they still fail to address the question of latent type information. They can only work with information that's explicitly encoded into the program (or the underlying language). They would work wonderfully on something like Kevin's example (linked above), because that example contains explicit type information. Much of that information could be partially evaluated away, along with the associated type checks. However, this partial evaluation isn't nearly as useful on ordinary latently typed programs, because the latent types are not explicitly present in the program. Latent type information is implicit in the program's terms, and without explicit type annotations or a way to infer types, there's a limit to what can be done. I believe there's been work on doing various kinds of inference in conjunction with partial evaluation, although I'll have to dig for references.

Harper's thesis

Afaik, Robert Harper's "thesis that dynamic typing is a particular case of static typing" is merely a way to embed a dynamically-typed program in a statically-typed language, using a data type for tagged values, e.g. a Univ type.

I don't think that's what he meant. He was talking about type system design, rather than encoding programs in given languages.

Obviously, you can make a static type system as imprecise as you want, by extending the dynamic semantics with respective "error cases". For example, division by 0 is such a case - the type of division is imprecise in any practical language. A respective ad-hoc semantics is defined for this case.

If you go to the extreme, you can even make the type of functions themselves completely imprecise. Consequently, you allow applying anything to anything, but the dynamic semantics is extended by dealing with the case where the former is not a function, in some ad-hoc way. And suddenly you have arrived at what many call a dynamically typed language.

Sorry if I'm repeating the obvious here. In any case, I don't see how this argument can be inverted to make statically typed a special case of "dynamically typed".

I don't buy Kevin Millikin's argument regarding encodings either, because it relies on some form of staged evaluation, making the whole static/dynamic distinction futile in the first place.

Actually, I don't think...

...Kevin's example shows what you think it does.

So Bob Harper says that if you've got recursive types and function types in your language, you've got Scheme, because you can code up a Univ datatype, and all the tag-checking that a Scheme runtime does will show up in the code that manipulates the Univ type. I don't think anyone really disagrees with this idea. (Though note that it doesn't work if the type system rules out nontermination, eg if you restrict recursive types to be inductive.)

Kevin then suggests that you can represent typed data in a dynamically typed language by coding up a typed value as a pair of a type representation and a value. Then, you can define some type-checking operations and manipulate typed data, getting errors when you make a type error.

This sounds plausible, but unfortunately it doesn't work. The reason is that a sound type system gives you a type abstraction property; there are no programs that can distinguish different representations of a type, because all well-typed programs only access values through operations on that type. But when manipulating terms with explicit type representations, programs that look at and manipulate those types are still valid programs, so you don't have an abstraction property.

In Crary, Weirich, and Morrisett's paper Intensional Polymorphism in Type-Erasure Semantics, you can see this idea at work. They pass around explicit type representations to enable optimizations like closure conversion, but have to drop them at abstract types in order to preserve type abstraction. (The earlier Morrisett and Harper paper Compiling Polymorphism Using Intensional Type Analysis will help make understanding the previous one MUCH easier.)

Bootstrapping

programs that look at and manipulate those types are still valid programs, so you don't have an abstraction property.

It's not a stretch of the imagination to see the typechecker "bootstrapped" in Kevin's scenario. Would that not solve this problem?

What do you mean...

...by "bootstrapping"?

bootstrapping

I'd thought this was a common term that would be understood. I basically mean that you could eventually, as in a language, have the type system itself be a typed program. For some reason, that made sense as a counterargument when I wrote the post, but now I can't see how it applies... then again, it is 5 am right now so I'm not thinking terribly straight.

Does so! :)

So Bob Harper says that if you've got recursive types and function types in your language, you've got Scheme, because you can code up a Univ datatype, and all the tag-checking that a Scheme runtime does will show up in the code that manipulates the Univ type. I don't think anyone really disagrees with this idea.

Right. I don't disagree with it either — after all, we've known this was possible ever since the first untyped language was implemented in a typed language. Any language enthusiast worth his weight in bits should have implemented something like this himself at some point.

However, my main point here is that this idea doesn't affect the latent types in an untyped program. They remain latent even if the program is converted into a typed language using a Univ type. That means that this idea doesn't do anything to address the question of whether or how latent types can be dealt with in conjunction with static types, i.e. something like the slider concept.

Kevin then suggests that you can represent typed data in a dynamically typed language by coding up a typed value as a pair of a type representation and a value. Then, you can define some type-checking operations and manipulate typed data, getting errors when you make a type error.

This sounds plausible, but unfortunately it doesn't work. The reason is that a sound type system gives you a type abstraction property; there are no programs that can distinguish different representations of a type, because all well-typed programs only access values through operations on that type. But when manipulating terms with explicit type representations, programs that look at and manipulate those types are still valid programs, so you don't have an abstraction property.

This is just the dual of what happens in the Univ example. In both cases, you're not really subsuming the target language and all its properties — all you're really doing is demonstrating how that language could be implemented. In both cases, to fully achieve the capabilities of the target language, including its expressiveness and abstraction properties, you'd have to go the extra step of providing a syntactic layer. That layer would hide many implementation details, such as the manipulation of the Univ type, or manipulation of terms with explicit type representations. What's the difference?

In Crary, Weirich, and Morrisett's paper Intensional Polymorphism in Type-Erasure Semantics, you can see this idea at work. They pass around explicit type representations to enable optimizations like closure conversion, but have to drop them at abstract types in order to preserve type abstraction. (The earlier Morrisett and Harper paper Compiling Polymorphism Using Intensional Type Analysis will help make understanding the previous one MUCH easier.)

I believe my point above covers this, but I'll check these papers out anyway, thanks.

Okay, let's suppose that you

Okay, let's suppose that you have two implementations of natural numbers. You've got a representation type, zero, successor, and primitive recursion as the abstract operations, and you pack them all up in an existential type. Now, you've got two implementations; one is a list of booleans which is a binary representation, and the other is a list whose length encodes the natural number (the Peano representation), plus implementations of the abstract operations.
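
As a concrete (if hypothetical) rendering of that setup, here is a minimal Haskell sketch using the ExistentialQuantification and RankNTypes extensions; each package exposes only zero, successor, and primitive recursion, and hides its representation:

{-# LANGUAGE ExistentialQuantification, RankNTypes #-}

-- An abstract natural-number package with the representation type hidden.
data Nats = forall n. Nats n (n -> n) (forall r. r -> (r -> r) -> n -> r)

-- Peano-style representation: a unary list whose length is the number.
peano :: Nats
peano = Nats [] (() :) (\z s n -> foldr (\_ r -> s r) z n)

-- Binary representation: a little-endian list of bits.
binary :: Nats
binary = Nats [] inc (\z s n -> iterate s z !! toInt n)
  where
    inc []           = [True]
    inc (False : bs) = True : bs
    inc (True  : bs) = False : inc bs
    toInt []       = 0
    toInt (b : bs) = (if b then 1 else 0) + 2 * toInt bs

-- A client sees only the packaged operations, so it cannot observe which
-- representation it was handed: this returns 2 for either package.
two :: Nats -> Integer
two (Nats z s r) = r 0 (+ 1) (s (s z))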

In a language with a type abstraction property, it's impossible to write a client program that can produce different outputs when run with the two different implementations. This happens because ill-typed terms aren't programs, and don't have a semantics.

By way of contrast, a Scheme program that looks at the representation of a type-carrying term is still a valid Scheme program, and so you can write programs that distinguish representations. (Eg, (car 4) is a perfectly good Scheme program with a well-defined semantics.)

Having fewer programs (ie, your type system rejects some terms as ill-typed) means that your language gives rise to a coarser notion of program equivalence. This is why the simply-typed lambda calculus can be given a set-theoretic interpretation in terms of types-as-sets and functions as set-theoretic functions, and the untyped lambda calculus doesn't have such a semantics. We don't have to give a semantics to terms like ((lambda (x) (x x)) (lambda (x) (x x))) in the STLC, because we don't consider them valid programs.

Now, adding language features that make programs distinguishable is a one-way street; you can't assume every other programmer in the world won't use them when that would be inconvenient. That's why type system features like monads are like roach motels -- you can make expressions monadic, but you can't get out of the monad, because you don't want things like state or control effects spreading into the rest of the language.

Types as functions?

By way of contrast, a Scheme program that looks at the representation of a type-carrying term is still a valid Scheme program, and so you can write programs that distinguish representations. (Eg, (car 4) is a perfectly good Scheme program with a well-defined semantics.)

This surely depends on how you wish to represent types in your embedding? If you represent them as a pair or list (type value) then you can obviously look at the representation. However, there are other ways you could encode type information. One which I have been considering for a while (since reading some very interesting posts here), is to represent a type as a function, which takes a value (or several values) as argument, and does any checks to make sure it is of the correct "type", and then accepts a message (with arguments); a bit like an object. By using partial evaluation, you could apply the value to the type, and get back a completely opaque representation of a typed value. Something like:

# Person:  :: String -> Int -> PersonMessage -> ... (result)
set neil [Person: "Neil" 24]
puts "neil's age = [$neil age]"

The Tcl syntax is because I'm working on an extension for Tcl which does just this (discussed previously), although Tcl doesn't have a partial evaluator, so the representation is still available. In this way, the "type" is really a function which interprets messages on behalf of a value. (I've been wondering whether typeclasses can be encoded in the same way, where the values passed in are first-class types — making typeclasses higher-order functions!) You can then do type-checking by plugging in a dummy value and running through a program applying each message and seeing if it is accepted. Would that not satisfy the "type abstraction" you require?
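
A rough transliteration of that idea into Haskell (purely illustrative; the Dyn wrapper and the names are made up) might look like the following, with the "type" doing its checks up front and then interpreting messages on behalf of the value:

data Dyn = DStr String | DInt Int deriving Show

-- A value that has been wrapped by its "type": all access goes through messages.
type Typed = String -> Dyn

person :: String -> Int -> Typed
person name age
  | age < 0   = error "Person: age must be non-negative"   -- the up-front check
  | otherwise = \msg -> case msg of
      "name" -> DStr name
      "age"  -> DInt age
      _      -> error ("Person: unknown message " ++ msg)

-- With neil = person "Neil" 24, the expression neil "age" evaluates to DInt 24,
-- and the underlying representation is no longer directly inspectable.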

Huh?

Okay, let's suppose that you have two implementations of natural numbers. You've got a representation type, zero, successor, and primitive recursion as the abstract operations, and you pack them all up in an existential type. Now, you've got two implementations; one is a list of booleans which is a binary representation, and the other is a list whose length encodes the natural number (the Peano representation), plus implementations of the abstract operations.

Or, let's suppose that I have two implementations of sequences. I've got a representation type, empty sequence, add an element, and remove an element as the abstract operations, and I pack them all up in an existential type. Now, I've got two implementations; one is a last-in-first-out stack, and the other is a first-in-first-out queue, plus implementations of the abstract operations.

Are you telling me that it's impossible to tell them apart in a typesafe language, because they typecheck and have the same type?

Weird.

Specification, Implementation, Visible Type

Are you telling me that it's impossible to tell them apart in a typesafe language, because they typecheck and have the same type?

No, that's not what Neel said.

In Neel's example you have two different implementations of the same specification (natural numbers) with the same visible type. The fact that, under the hood, one represents numbers with bit patterns and the other with a successor data structure is simply not observable to an outside caller.

In your example, you take two different specifications that have the same visible type and the same implementation details. The fact that they do not conform to the same specification (one is a stack and the other a queue) is by definition visible to an outside caller. (This is precisely what conforming to a specification means: having one's visible behaviour conform exclusively and completely to some expectation.)

Also, in your example you are not even distinguishing the part that would be "hidden" in the well-typed instance.

Behaviour vs. Signature

In my experience, many seem to think that by "specification" one means the signature of the operations, while the correct reading of "specification" is behavioural/observational.

This misunderstanding is understandable, since most programming languages provide constructs for specifying interfaces, yet these simply list the available operations, not their semantics (axiomatic or otherwise). Thus, in these languages the definition of Queue is simply the list of available methods, not the axiomatic ADT definition.

Obviously, by "type" and "specification" we refer to the strong, semantic, reading. Since the misunderstanding is so common it might make sense to be a little more explicit when the appropriate meaning of the term can be unclear.

Also the part that would be "

Also, in your example you are not even distinguishing the part that would be "hidden" in the well-typed instance.

No, I am distinguishing them. I never said anything at all about using the same implementation, so you can presume that one uses a list and one a vector.

This dialog could go very much like one of those Tortoise and Achilles things:

The nice thing about sound static type systems is that they have the type abstraction property: there are no well-typed programs that can distinguish different representations of the same type.

What about a stack and a queue that have the same type? They are easy to distinguish.

That's not the same. You've made them have the same signature, but they have different types; their specification is not the same. If you have the same type, but different specifications, then you can distinguish them by definition. The type abstraction property guarantees that there are no well-typed programs that can distinguish different representations of the same type, provided the representations have the same specification.

OK, what about a buggy implementation of natural numbers whose successor function is the identity vs. one of Neel's natural number implementations?

Obviously those are distinguishable because one doesn't conform to its specification. The type abstraction property guarantees that there are no well-typed programs that can distinguish different representations of the same type, provided the representations have the same specification, and if both conform to their specification.

Fine. What if I had a stack that was a linked list, and one that was a fixed size vector of 1000 elements? I could distinguish those, but the typesystem blesses them as the same.[1]

Well, those things are not part of the specification we have in mind for stacks. Usually, for things outside the specification, any behavior at all is acceptable. So, you would expect to be able to distinguish the representations on things outside their specification. The type abstraction property guarantees that there are no well-typed programs that can distinguish different representations of the same type on properties that are part of their specification, provided the representations have the same specification, and if both conform to their specification.

Then dynamically typed languages enjoy the same property: there are no programs that can distinguish different representations of anything on properties that are part of their specification, provided the representations have the same specification, and if both conform to their specification.

OK, out of dialog mode. Does this type abstraction property buy us anything that we don't already have because two representations share the same specification and conform to their specification (both of which must be separately checked in statically typed languages)? If I have two stack implementations in Scheme, and one answers #t to vector? while the other answers #f to vector?, then what might be the problem? Well, if the specification of stacks is mute about whether they should be vector? or not, then there is no problem. And if the specification says that stacks should not be vector? (for instance), then one of them is a buggy implementation.

[1]I can write programs to distinguish Neel's two examples: one is a linear size representation with a constant time successor function and the other is a logarithmic size representation with a successor function that sometimes takes logarithmic time. What we would do is measure memory consumption vs. number of calls to successor, or else measure the time taken for calls to successor.

Jabberwocky

I never said anything at all about using the same implementation, so you can presume that one uses a list and one a vector.

You didn't mention commercial whaling in 19th-century Japan either, but I would find it quite odd to start discussing the type safety of ambergris. ;-)

By Ockham's razor, as well as the principles of clear examples and reasonable discourse, it is normally safe to assume that all and only relevant details have been given.

The type abstraction property guarantees that there are no well-typed programs that can distinguish different representations of the same type on properties that are part of their specification, provided the representations have the same specification, and if both conform to their specification.

I think we still have a language problem here. There seem to be two distinct uses of specification in your post: the behavioural specification (semantics) of your program in general and the "specification" guaranteed by the type system for the type signature.

A type system only makes the guarantees that it can: if your type system cannot distinguish between stacks and queues, it isn't reasonable to fault it for not distinguishing between the two (even though you can).

(It does occur to me though that there might be some variant of linear typing that could be made to distinguish between the two types: maybe someone wants to research it? ;-) )

A type system makes certain guarantees about particular ways your program can't go wrong. This can also be expressed in the way Neel does as saying that your type system can ensure that particular internals cannot be visible.

The fact that it can't prevent ALL problems or distinguish ALL differences of behavioural specification seems to be a red herring (or, if you prefer, a wayward turtle).

Obviously Not

Kevin Millikin: Or, let's suppose that I have two implementations of sequences. I've got a representation type, empty sequence, add an element, and remove an element as the abstract operations, and I pack them all up in an existential type. Now, I've got two implementations; one is a last-in-first-out stack, and the other is a first-in-first-out queue, plus implementations of the abstract operations.

Right, now you've introduced state into the equation...

Kevin: Are you telling me that it's impossible to tell them apart in a typesafe language, because they typecheck and have the same type?

neelk not only said no such thing, he addressed the issue explicitly in the post you're responding to:

neelk: Now, adding language features that make programs distinguishable is a one-way street; you can't assume every other programmer in the world won't use them when that would be inconvenient. That's why type system features like monads are like roach motels -- you can make expressions monadic, but you can't get out of the monad, because you don't want things like state or control effects spreading into the rest of the language.

Adding state qualifies as "adding language features that make programs distinguishable" with a vengeance. A plain ol' HM type system won't protect you from this, but some kind of linear type system will. So let's be careful, once again, to distinguish between what "static typing" can do, and what the most popular (Hindley-Milner, Standard ML, Haskell, O'Caml...) type systems do.

Right, now you've introduced

Right, now you've introduced state into the equation...

No, I didn't. I have in mind two purely functional implementations.

So let's be careful, once again, to distinguish between what "static typing" can do, and what the most popular (Hindley-Milner, Standard ML, Haskell, O'Caml...) type systems do.

Of course, I agree completely. But isn't that what Neel's criticism (that dynamically typed languages are unable to hide the representation of data) is? Don't most of the dynamically typed languages already do exactly that?

dynamically typed languages are FUN!

This is one of dynamically typed languages' best defenses. Paul, haven't you ever enjoyed programming in a language like Perl, Python, or Ruby? What about Scheme?

Fun is where you find it

This is one of statically checked languages' best defenses. Steven, haven't you ever enjoyed programming in a language like Haskell, Scala or OCaml?

Totally!

Absolutely; I love programming in Common Lisp and Scheme. I wrote somewhere—I can't seem to find where at the moment—that the error I'd made was in assuming that what I found fun about programming in Lisp/Scheme, the try-and-go aspect of it, was intrinsic to dynamic typing. But it turns out that it isn't: I have that same try-and-go experience in O'Caml. This was a revelation! So now I have the fun in at least one statically-typed language that I used to have in my preferred dynamically-typed languages.

Note that I'm still very keenly interested in Oz, another dynamically-typed language. So all hope is not lost. :-)

Zero Kelvin Systems

For an academic paper, he should probably tone it down, and I am pretty sure he will. But hey, it's a draft version, isn't it? The points he made are pretty valid; I do agree that often the typesafe argument is 'oversold' in academic environments - up to the point that one can mistakenly take a protagonist for a fanatic (but hey, if the Turing test makes it a fanatic ;0).

As far as I can see, he opens his article by stating that it intends to be provocative; i.e., it should be read as a column. Columnists are there to entertain and provoke - there are no other rules in columns. (Hehe, new motto for LtU - columnists to science ;-)

Columnists bring freedom. Prior to public debates, in my country, sometimes 'comedians' introduce debaters and/or the topic with some ironic humor. They serve one purpose - they entertain and provoke, but also, they loosen the rules by building a liberal atmosphere where more can be said and 'politically correct' assumptions can be challenged.

By peer pressure alone, academia is always at risk of creating a zero-kelvin system where everybody carefully agrees with each other and nothing moves. I don't like zero-kelvin systems - and find them unacademic to start with. So, applause for Meijer.

Originality is good

...but not at all costs. This "column" makes several factual errors and mistakes. Just disagreeing isn't enough. One should have a reason to disagree. "I disagree with the Church-Turing thesis" - OK, great. Why? What reason? "Because such a simple machine can't possibly do what C does." No, you're an idiot. An "original" idiot, but still an idiot. You're going against the grain of academia, but you're still stupid.
Granted, the errors in this paper aren't that glaring, but the idea remains the same. Yes, the authors are academic "rebels". But they're still wrong.

Abstract Interpretation

Any opinions on why abstract interpretation isn't more popular? I'm toying with an interpreter/compiler for a Joy-like language, and of course the issue of typing came up. But instead of having static type inference or latent type checking, I've been interested in executing programs with types instead of values. So, for example, primitives like the '+' function wouldn't operate on bits that represent numbers, it would operate on tags that represent types. And you could run this program on types before you ever try to run it on values, so you could catch type violations at 'compile' time. Or instead, for each potential type error, you might decide to insert a run-time check into the program which operates on values. Or maybe, a little bit of both. You decide that some type violations are bad enough to warrant stopping the compiling process, but others get the run-time checks.
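
To make the scheme concrete, here is a minimal sketch (Haskell, with invented Op and Ty names, not the poster's actual language). The same stack program can be interpreted once over values and once over type tags, and the tag-level run flags violations before any real data flows:

data Ty  = TInt | TBool deriving (Eq, Show)
data Val = VInt Int | VBool Bool deriving Show
data Op  = Push Val | Add | IsZero

-- Concrete interpreter: a stack machine over values.
runVal :: [Op] -> [Val] -> [Val]
runVal []             s                     = s
runVal (Push v : p)   s                     = runVal p (v : s)
runVal (Add    : p)   (VInt a : VInt b : s) = runVal p (VInt (a + b) : s)
runVal (IsZero : p)   (VInt a : s)          = runVal p (VBool (a == 0) : s)
runVal _              _                     = error "runtime type error"

-- Abstract interpreter: the same ops run over type tags instead of values,
-- so '+' checks for TInt rather than adding numbers.
runTy :: [Op] -> [Ty] -> Either String [Ty]
runTy []                   s                 = Right s
runTy (Push (VInt _)  : p) s                 = runTy p (TInt  : s)
runTy (Push (VBool _) : p) s                 = runTy p (TBool : s)
runTy (Add    : p)         (TInt : TInt : s) = runTy p (TInt  : s)
runTy (IsZero : p)         (TInt : s)        = runTy p (TBool : s)
runTy _                    _                 = Left "type violation (or: insert a run-time check here)"

Running runTy over a program before ever calling runVal gives the 'compile-time' pass described above, and a Left result could either abort compilation or trigger insertion of a run-time check at that point.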

It seems like a variation on this scheme could get a person as close to blurring the line between static and dynamic types as he wanted. (And I'm interested if anyone has pointers to literature which discusses similar ideas)

The advantage of type-checking over AI...

...is that typechecking can be modular, and abstract interpretation can't. To do an abstract interpretation, you must have the whole program available, which isn't the case with type systems.

modularity of AI

It seems like I've come across a paper by one of the Cousots about viewing type-checking as a particular instance of AI... Could you give a (novice) explanation of why AI has modularity problems?

similar philosophy in boo

"Static typing where possible, dynamic typing when needed!"

Boo has a similar general goal. It's statically typed with type inference, but you can explicitly declare that something should be dynamically typed (it uses the term duck typing), or else pass a command-line option to enable implicit duck typing, like Python. Since the compiler is open, there's no need to wait years for Microsoft/Sun to get around to adding features like type inference, etc.
See: Boo Home Page, duck typing, type inference, differences with Python, differences with C#, AST macros, the Boo add-in for SharpDevelop (shows how you can have code completion for the static parts).

Auto-Complete

More than safety, I think the biggest benefit of static typing is IntelliSense/auto-complete, which is a major productivity boost in big projects.

That said, I think type inference would be great, because I do not like to see repeated information.

I disagree: readability is key

For me, the great thing about type inference is better readability, thanks to less unnecessary duplication cluttering the source.

I remember that the first time I saw type inference in a statically typed language was in Limbo (the programming language for Plan9); was Limbo really the first language to do this (I doubt it)?

As for autocompletion, frankly I don't think that it is such a big win; for me it is useful in Java only because that language is quite verbose.

Re: AutoComplete

Strange as it may seem, you can have autocomplete in a dynamically typed language such as Smalltalk (see VisualWorks 7.x).

Smalltalk also has a jump on refactoring

And not to mention object browsers (using an image does have its advantages). Personally, I find autocomplete to be about as annoying as it is helpful - sometimes nice, sometimes distracting. But then I have a hard time understanding how such a feature actually boosts productivity, seeing how writing code is the easiest part of my job.

Such systems which document b

Such systems which document basic interfaces are a big boost in the productivity of novices and weak programmers dealing with simple code and extensive libraries.

Refactoring - Java also

For the last couple of years Java IDEs (and now C# IDEs) have provided refactoring support.

And finally we seem to be getting something like Smalltalk's rewrite engine (pdf) - Jackpot

Rule-based Java rewriting predates Jackpot

It's been available commercially for at least a year now in IntelliJ IDEA. Indeed, it's tough to tell from the press releases what of Jackpot isn't already commercially available, beyond James Gosling's name.

What has been available...?

Thanks, but what has "been available commercially for at least a year now" - an AST based rewrite engine?

Yes

It's called (not particularly originally) "Structural Search and Replace". It's less useful than it might be because the built-in refactorings and code audits cover many of its more obvious possible uses better, but I've used it for batch API transformations, layer replacements, and such.

some autocompletes are more complete

With type information, autocomplete can be more specific, which makes it more useful.

A good start...

but he missed some.

When I say I like statically typed systems, what I often really mean is...

that I like systems which force my coworkers to at least minimally document their systems (note that implicit static typing actually fails at this almost as badly as dynamic typing).

that I like systems where error checking is available during the coding phase or even the design phase. Compile-time error checking is far too late. Stock run-time error checking is essentially useless.

that I like systems where common coding idioms can be mechanically detected. That includes both correct idioms (for reverse-engineering and style-enforcement purposes) and bug idioms (for static bug-finding purposes).

that I like systems where all usages of a particular piece of functionality can be found safely and completely, within 1-5 seconds for a 500KLOC system.

that I like systems where type coercions aren't silently inserted into my code. This is admittedly a problem with many statically typed mainstream languages, but it's far worse with the mainstream dynamic languages.

that I like languages where the security model actually has a chance of stopping insecure usages from dynamically loaded code.

that I like systems which can be run without too much interpretive overhead. Processor cycles are not, sadly, free.

When I say I like dynamic languages, I often mean...

that I like languages where reflection is easily accomplished, particularly reflective interception of function/method/procedure calls and reflective analysis of the running code-base.

Hmm, that seems to be it for me on the dynamic side.

Each camp will undoubtedly claim that there are experimental or non-mainstream languages or systems which bridge the gaps I've pointed out. As a languages junkie I find those fascinating. As a professional programmer, rational ignorance makes more sense. Until those lovely gems from the fringes of type theory make it into languages that can be productively used, the authors' points are well taken, and are a valid critique of programming-as-it-is-actually-done.

Nothing to Disagree With...

...but this just reinforces my point that authors of these diatribes need to be clear that they're critiquing "static typing" as implemented in: C++, Java, C#...

The gap even between these languages and a not-at-all-fringes-of-type-theory language like any member of the ML family, or Haskell, is vast: none of the popular languages even does type inference. All require some level of unsafe-at-runtime casts to express some idioms that are statically safe in ML or Haskell. None has anything even remotely resembling ML's module system. And so on. And this is all without considering anything as out there as GHC 6.4's GADTs, or even the more modest, but still unfamiliar, capabilities such as "phantom types" in O'Caml or presumably other ML implementations.
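
Since phantom types keep coming up in this thread, a minimal sketch (Haskell here, though the same trick works in O'Caml) shows how little machinery is actually involved: the type parameter never appears on the right-hand side of the definition and exists only to record a compile-time fact:

newtype Distance unit = Distance Double deriving Show

data Meters   -- empty types used only as compile-time tags
data Feet

meters :: Double -> Distance Meters
meters = Distance

feet :: Double -> Distance Feet
feet = Distance

-- Mixing units is rejected by the type checker, at no run-time cost.
addDist :: Distance u -> Distance u -> Distance u
addDist (Distance a) (Distance b) = Distance (a + b)

-- addDist (meters 3) (feet 10)   -- does not compile
-- addDist (meters 3) (meters 4)  -- fine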

So this is another issue that I have with the debate: you can take venerable old Standard ML (we're talking about 1975 here, folks) and it still is a vastly safer language than C, C++, Java, or C#, and arguably more expressive, too, although the object purists will disagree with that. I'm sorry; I just don't have any sympathy for an argument that claims that Hindley-Milner type inferencing and parametric polymorphism are "too novel."

And the argument that the MS paper critiques programming-as-it's-done-today misses the mark also, IMHO: the paper itself doesn't say any such thing, and in fact goes out of its way to mention GADTs and "wobbly types" as being "too complex." Well, perhaps they're complex for the compiler developers, and maybe they're difficult for programmers to apply at first because they're unfamiliar, but anyone who says they're too hard to apply, but doesn't complain bitterly about the C++ STL or boost libraries, is being either forgetful, having overcome the learning curve in using these complex tools, or intellectually dishonest. The simple truth is that we rightly expect our programming tools to support us in applying greater and greater levels of abstraction over time, and each introduction of new leverage in that regard, in whatever language, incurs a short-term cost in favor of a long-term gain. Some folks will choose not to invest in learning about the new tools, and that's OK; maybe they'll never be working in a domain in which they'll benefit from them, so with a degree of emphasis on backward compatibility in the new tool set, no harm, no foul. But to go off on a rant about how awful template metaprogramming is, or GADTs are, ignores the benefits of the new tools in favor of an exclusive focus on their temporary costs. In the end, I suppose this is what I'm referring to when I say "intellectually dishonest." There's nothing wrong with pointing out the cost of something, nor is there anything wrong with saying "I don't expect ever to take advantage of it." But there is something wrong with saying "And you shouldn't/won't either."

baby steps for a different audience

...need to be clear ... "static typing" as implemented in: C++, Java, C#...

I agree and disagree, it depends what audience they are trying to reach.

not-at-all-fringes-of-type-theory language like... none of the popular languages even does type inference

Doesn't that suggest to you that 'not-at-all-fringes-of-type-theory languages' are still out-there-on-the-fringe of programming languages?

anyone who says they're too hard to apply, but doesn't complain bitterly about the C++ STL or boost libraries, is being either forgetful, having overcome the learning curve in using these complex tools, or intellectually dishonest

Or it's not particularly relevant to advancing the points they want to put forward in a 6 page paper!

It's safe for me to assume that Erik Meijer knows more about type systems than I ever will - so I can cut them some slack and wonder who the authors want to reach, and what message they are trying to convey. Maybe they are trying to frame a discussion on how C# should develop to attract Java 5 and Python programmers?


But there is something wrong with saying "And you shouldn't/won't either".

They say nothing of the sort - perhaps for you static typing is possible everywhere, dynamic typing is needed nowhere :-)

You're Mostly Reiterating My Points!

Isaac Gouy: I agree and disagree, it depends what audience they are trying to reach.

Well, I remain a bit confused as to who the paper's audience was intended to be myself. Part of my critique is precisely that: they go so far as to complain about phantom types and GADTs, but the bulk of the paper ends up complaining about dramatically more limited forms of static typing. Again, I think this reflects a perfectly human conflation of what are, upon reflection, two different categories of complexity: the temporary complexity of becoming familiar with tools like phantom types or GADTs vs. the ontological complexity that developers of compilers employing such rich type systems face. Generally speaking, I'm sympathetic to such a conflation; it's not transparently obvious where the line between those complexities lies. But Erik Meijer most assuredly has a better idea than 95% of the programming population!

Isaac: Doesn't that suggest to you that 'not-at-all-fringes-of-type-theory languages' are still out-there-on-the-fringe of programming languages?

That's precisely why I said "of-type-theory:" to the extent that the argument I was responding to failed to observe that there are respects in which even something as simple as Standard ML, or any other language with a Milner-Damas type system and a module language such as Standard ML's, is safer than the popular alternatives, one can observe that you haven't gone anywhere near as far out as GADTs/wobbly types to get there, i.e. that you're talking about a type system and inference regime that's been around since 1975, and so is roughly contemporaneous with popular languages such as Pascal, and with the granddaddy of the popular object-oriented programming languages, Simula-76. That is, it's not reasonable to claim that the type system of what, in the literature, is called "Core ML" as a shorthand for "the simply-typed Lambda Calculus with let-polymorphism" is novel, poorly-documented, and/or poorly-understood, as a counterobservation to its practical value.

Isaac: Or it's not particularly relevant to advancing the points they want to put forward in a 6 page paper!

Possibly, but if the goal is to actually advance an argument that doesn't omit or misrepresent information, then they may need to target such a paper at someone other than the ADD set and write a longer paper.

Isaac: It's safe for me to assume that Erik Meijer knows more about type systems than I ever will...

Why?

Isaac: so I can cut them some slack...

This seems like a perverse perspective to me: it's precisely the fact that I know that Erik Meijer knows better that causes me to hold him, and papers that he co-authors, to a higher standard than would otherwise be the case.

Isaac: ...and wonder who the authors want to reach, and what message they are trying to convey. Maybe they are trying to frame a discussion on how C# should develop to attract Java 5 and Python programmers?

Again, if that's what they said, that would be great, and they could even get closer to that goal by just removing the nonsense paragraphs about "static type fanatics" and phantom/wobbly types. If they wanted to go one step farther, they could add some verbiage about wanting to address C#, Java, and Python programmers. But the explicit thrust of the paper is to identify some middle ground between static and dynamic typing in general. That's a totally worthwhile goal—one that I heartily agree with—but it will not work when the starting position contains flat inaccuracies about the nature of static typing and/or type theory, beyond what we can accept as opinions, even strongly-held ones, about the value or desirability of the properties of type theory and/or static typing.

Isaac: They say nothing of the sort - perhaps for you static typing is possible everywhere, dynamic typing is needed nowhere :-)

Heh. I'll have to take your claim that they say nothing of the sort on faith, given that they've characterized essentially all modern type theory researchers, including, one would think, Meijer himself, and several language designers as "fanatics." As for static typing being possible everywhere and dynamic typing needed nowhere for me, you can find what I feel is my best writing on the subject in my Good Questions! response to a question from axilmar.

Slight correction

It's Simula-67 (not 76). Smalltalk and C++ were both influenced by the Simula language. Alan Kay started messing with Simula in the late 60s to early '70s but Smalltalk took almost a decade to form.

Whoops...

... you're quite right: Simula-67 to the first Smalltalk in 1969, so these predate the first ML by roughly 3-5 years. I don't think this does appreciable damage to my thesis. :-)

really?

You're Mostly Reiterating My Points!

maybe these were points made in some other part of the discussion which I skipped through.

confused as to who the paper's audience

one clue is "Intended for submission to the Revival of Dynamic Languages"

it's not reasonable to claim that the type system of ... "Core ML" ... is novel, poorly-documented, and/or poorly-understood, as a counterobservation to its practical value.

The only ML I found in the paper was XML - how much are you reading between the lines?


but if the goal is to actually advance an argument that doesn't omit or misrepresent information, then they may need to target such a paper at someone other than the ADD set and write a longer paper.

imo that complaint is both bizarre and unfair.

OK, Maybe Just One Point

Isaac Gouy: maybe these were points made in some other part of the discussion which I skipped through.

Perhaps I was mistaken, but it seemed to me that we largely agreed that the success or failure of the paper to make its points was dependent upon its target audience. I took that to mean that you felt, as I do, that the paper's target audience isn't as clear as it could be.

Isaac: one clue is "Intended for submission to the Revival of Dynamic Languages"

Not much of a clue; starting with that, I'd expect the paper to focus a lot more on the positives of static typing if its goal is indeed to attempt a meeting-in-the-middle. Instead, if we accept that its target audience is the same audience likely to be participating in a "Revival of Dynamic Languages," then it seems like this paper, or at least the part that I feel strongly misrepresents static typing, is a huge chunk of red meat to toss to the faithful. That's not necessarily accurate or invalid if it is accurate, but given that I choose to accept their stated goal of a meeting-in-the-middle as being offered in good faith, I find it hard to reconcile, and since I find it hard to reconcile, I choose to conclude that I am confused. If I am not confused, then either their stated goal of encouraging a meeting-in-the-middle on static vs. dynamic typing was not offered in good faith, or they (perhaps inadvertently) have done damage to their goal a matter of mere paragraphs into the first page.

Isaac: The only ML I found in the paper was XML - how much are you reading between the lines?

Really, Isaac, this is a bit beneath you: the paper makes explicit reference to phantom types and wobbly types. It's well understood among the readership here that wobbly types, or GADTs, are a generalization (that's what the "G" stands for) of the parametric polymorphism that's been found in the likes of ML and Haskell for many years. It's likely even better understood that "phantom types" can be employed even in the relatively simple type systems of Standard ML or pre-GHC 6.4 Haskell, but don't have to be. To make the point that there are practical benefits to the type systems found in such an ML or Haskell above and beyond the benefits of static typing as found in C++, Java, or C#, but without making use of phantom or wobbly types, doesn't require justification, or at least it doesn't seem like it should given the audience here—not even the local dynamic typing fans deny that point. In any case, part of my point was precisely that the paper fails to mention these existing languages, as if the next point on the continuum were GHC 6.4:

<---C++/Java/C#--------------wobbly types--->
but they missed a step:
<---C++/Java/C#-----SML/O'Caml/Haskell 98-----wobbly types--->
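To make the phantom-type point concrete, here is a minimal sketch in plain, pre-GHC-6.4-style Haskell (no GADTs or wobbly types involved); the names Query, Raw, Validated, mkQuery, validate, and runQuery are all made up for the illustration:

-- The type parameter of Query never appears on the right-hand side, so it
-- is a "phantom": it exists only so the checker can tell raw strings apart
-- from validated ones.
newtype Query a = Query String

data Raw = Raw
data Validated = Validated

mkQuery :: String -> Query Raw
mkQuery = Query

-- A toy validation step; it is the only way to obtain a Query Validated.
validate :: Query Raw -> Maybe (Query Validated)
validate (Query s)
  | ';' `elem` s = Nothing
  | otherwise    = Just (Query s)

-- Only validated queries may be run; passing a Query Raw is a compile-time error.
runQuery :: Query Validated -> IO ()
runQuery (Query s) = putStrLn ("running: " ++ s)

Nothing here goes beyond what SML or Haskell 98 already offered, which is exactly the missing middle step in the diagram.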

Isaac: imo that complaint is both bizarre and unfair.

I don't see why. You said that you thought something might be irrelevant to their advancing the argument they wanted to advance in a six-page paper. My response was that if such brevity necessitated the level of inaccuracy found in the paper as it stands, then they should write a longer-in-order-to-be-more-accurate paper. I phrased this in such a way as to indicate my semi-serious belief that the only justification for maintaining the current selection of brevity over accuracy would be to appeal to an audience with an extraordinarily limited attention span. I guess I should have attached a smiley to that, but a smiley would be too strong: a six-page paper is, after all, a junior high-school writing assignment, or at least it was when I was in junior high school. I have grave doubts about anyone's ability to articulate the issues inherent in static vs. dynamic typing, let alone promote a reconciliation of them, in six pages!

That's not necessarily accu

That's not necessarily accurate or invalid if it is accurate, but given that I choose to accept their stated goal of a meeting-in-the-middle as being offered in good faith, I find it hard to reconcile...

I think it's awfully convenient that VB (Variant for VB6; Universal for .NET) has been in the middle...

focus and scope

focus a lot more on the positives of static typing

From the perspective of "Revival of Dynamic Languages" the positives are that a statically checked language can be more like a dynamically checked language.

a huge chunk of red meat to toss to the faithful

Could it be that bludgeoning "the faithful" with how inadequately they understand static typing may discourage them from reading the rest of the paper?

Could it be that a nice chunk of red meat will put them in a more receptive frame of mind?

(For more of a clue google "Revival of Dynamic Languages")


part of my point was precisely that the paper fails to mention these existing languages

No doubt they'll cover that in the paper "An exhaustive history of statically checked programming languages". I haven't seen how you think these omissions are relevant to the points the authors attempt to make throughout the paper.

(They mention XML - should they be criticised for not discussing SGML? Patrick seems to bash them, on his blog, for not discussing TDD!)


bizarre and unfair... I don't see why

Bizarre because you're imposing a goal (omit no information) which may simply obscure what they have written.

Unfair because the article length may have been dictated by Revival of Dynamic Languages not by the authors.

a six-page paper is, after all, a junior high-school writing assignment

The Declaration of Independence is shorter than that - but it seemed to have an effect.

More Reading-Into Fun!

Isaac Gouy: From the perspective of "Revival of Dynamic Languages" the positives are that a statically checked language can be more like a dynamically checked language.

Sure, of course, that's the rationale that all of us who are interested in type theory adhere to, and it's specifically what motivates research into richer type systems than are found in the popular languages. I'm satisfied that you, I, and virtually all readers of this post understand and appreciate that. What's not at all clear to me is that the authors of the paper do, let alone their target audience. If they did, it would be child's play to edit a couple of early paragraphs to say something like "As of this writing, efforts to give static type systems something closer to the expressive power of dynamic types have resulted in concepts such as 'phantom types' and 'wobbly types,' which, while being more expressive, suffer from lack of familiarity by virtue of only being available in research languages such as version 6.4 of the Glasgow Haskell Compiler, from poor error messages in these early implementations," etc. "We should encourage type researchers in these efforts, but without compromising on ease-of-use considerations. Phantom types and wobbly types aren't yet usable outside of the type research community..." "More modest improvements on the static type systems of C++, Java, or C# can be found in current functional languages such as Standard ML, Objective Caml, or Haskell 98." See? Easy. Well, apart from needing a good editor...

Isaac: Could it be that bludgeoning "the faithful" with how inadequately they understand static typing may discourage them from reading the rest of the paper? Could it be that a nice chunk of red meat will put them in a more receptive frame of mind?

I think what I just posted makes the point nicely ("efforts to give static type systems something closer to the expressive power of dynamic types") without resorting to calling people "fanatics," simply being wrong about what "go wrong" means, or offering no hope of tools like phantom types or wobbly types ever being available in an intuitive, legible language (to C++/Java/C#/Python... programmers).

Isaac: No doubt they'll cover that in the paper "An exhaustive history of statically checked programming languages". I haven't seen how you think these omissions are relevant to the points the authors attempt to make throughout the paper.

It only took me a few sentences to say:

  1. C++/Java/C# exist at a certain point on the expressive/complex spectrum.
  2. Wobbly types exist in (a) language at a different point on the expressive/complex spectrum.
  3. There are several other languages that exist between the first two points on the spectrum that might serve as exemplars of where static typing has been since 1975 and launching-off points for where it could go in the future.
Again, since the whole stated point is to "meet in the middle," why not mention languages that are a whole lot closer to the middle, e.g. supporting type inference, than the C++/Java/C# family? It incurs almost no cost, especially if you elide the paragraphs that are either bombastic or just wrong.

Isaac: They mention XML - should they be criticised for not discussing SGML? Patrick seems to bash them, on his blog, for not discussing TDD!

They mention XML as an example of semi-structured data that dynamically typed languages can handle and statically-typed languages can't. Wrong again. Oh, wait; they mean C++/Java/C#. "Now, the circle is complete."
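For what it's worth, here is a rough sketch (mine, not the paper's) of the standard statically typed treatment of semi-structured data: a single recursive sum type that any document, however irregular, inhabits, and that ordinary functions pattern-match over:

-- A generic node type for semi-structured, XML-ish data; the names are
-- illustrative only.
data Node
  = Text String
  | Element String [(String, String)] [Node]   -- tag name, attributes, children
  deriving Show

-- Example: collect all text directly under a given tag, whatever the shape.
textsUnder :: String -> Node -> [String]
textsUnder _   (Text _) = []
textsUnder tag (Element name _ children)
  | name == tag = [t | Text t <- children] ++ concatMap (textsUnder tag) children
  | otherwise   = concatMap (textsUnder tag) children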

Patrick's point is the one generally offered by the dynamic typing camp: TDD is equivalent to static typing. It isn't, unless you have 100% coverage and a model of your language's dynamic semantics, which no one does. This is why, as Jeff Cutsinger noted elsewhere, I keep bringing up systems like Acute and TyPiCal: dynamic typing has nothing like them other than a probabilistic approximation to them. That makes it crystal clear that the question at hand, whether you are willing to accept the odds of failure, is a fundamentally economic one. But this isn't an economics blog, the paper isn't a paper on software economics, and it isn't addressing an economic audience. It's making specific claims about static and dynamic types, so it'd be nice if they got some basics about static typing right.

Isaac: Bizarre because you're imposing a goal (omit no information) which may simply obscure what they have written.

What's bizarre, to me, is that you seem to prefer that they definitely do obscure some facts about type theory and existing statically-typed languages, rather than accept some simple changes that lack such obscurantism out of a fear of something that might be... what? More accurate but harder to understand? Something that might motivate someone who's never been exposed to any member of the ML family to say "Wow, I had no idea there were languages that don't require me to annotate my terms with types, but will still tell me at compile time if I screwed this up. Is this stuff any good? Let me try it!" etc.

Isaac: Unfair because the article length may have been dictated by Revival of Dynamic Languages not by the authors.

Fair enough, but as I pointed out, they can replace a few paragraphs with a few sentences, actually making the paper even shorter, unless they wanted to become accurate about semi-structured data and some static type systems. As a Lisp/Scheme/Smalltalk programmer myself, I can tell you as a matter of personal history that that's exactly the kind of presentation that finally overcame my resistance to the syntactic oddities of ML and considerable skepticism regarding static types and type inference. I suspect it could also work well for others.

Isaac: The Declaration of Independence is shorter than that - but it seemed to have an effect.

Great. I'll await the Thomas Jefferson rewrite of the paper, then, because as it stands it rather more closely resembles a Jack Chick tract.

funny

Oh I get it! You're cleverly demonstrating an inflexible response to unexpected runtime context :-)

Could it be that they chose to mention 'phantom types' and 'wobbly types' for comic effect?

why not mention languages... supporting type inference...

Ummm would that be section 2.1?

Could it be that they have some interest in promoting Cw! (Rather than ML, or...)


What's bizarre, to me, is that you seem to prefer...

I simply recognise that the authors have written a position paper, to promote their own agenda (not yours or Patrick's), unconstrained by the rigid conventions of an academic paper.

Gosh darn'it there's more than one way to write effectively.

OK...

...then I rescind my willingness to accept their claim to wish to reach a middle ground between static and dynamic typing as having been offered in good faith, in which case they're merely engaging in demagoguery. I'm glad that we've been able to resolve this satisfactorily.

C# has type inference

C# does now have type inference to support LINQ, although it isn't ubiquitous. C++0x plans on having a type inference engine as well, so those languages are moving towards the ML side of the spectrum.

"Type inference"

Just to be clear: what is referred to as "type inference" in the context of these languages is mostly just trivial bottom-up type derivation, which does not require much of an "engine". It has little to do with what functional programmers associate with that term.
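To illustrate the difference with a minimal Haskell sketch (mine, not tied to any particular C# feature): under Hindley-Milner inference the definition below carries no annotations at all, yet it receives the most general polymorphic type and can be reused at every instantiation, which is more than bottom-up derivation of a single local's type gives you:

-- No type annotations anywhere; the compiler infers pairUp :: a -> b -> (a, b)
-- and generalizes it.
pairUp x y = (x, y)

demo :: ((Int, String), (Bool, [Int]))
demo = (pairUp 1 "one", pairUp True [2, 3])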

"Coding phase" vs "compile time"

One comment caught my eye:

"that I like systems where error checking is available during coding-phase or even design-phase. Compile-time error checking is far too late. Stock run-time error checking is essentially useless."

"Coding phase" error checking (by which I assume you mean error-checking that occurs as humans enter or inspect the source code) is OK, but "compile time" error checking (by which I assume you mean errors caught by the compiler when compiling) is "too late"?

How in the heck does that work??? Pardon me if I sound rude (I don't mean to be), but are you entering all of your code in a phase apart from typing "make"? I can't think of any circumstance in which that would be a sound development practice, other than a horrendous codebase that takes eons to compile (in which case you have more serious problems).

Stock run-time error checking can be quite useful in many problem domains (including most "IT" tasks; realtime and mission critical SW is another matter), if you've got the right tools.

While I agree that manifest typing is useful, as it provides visual clues to programmers about what is going on--especially if there is more than one programmer on the project--overzealous reliance on "human typecheckers" strikes me as quite an anti-pattern.

Unsatisfied

The paper opens with a paragraph that talks about what the advocates of static typing argue. Then, for some reason it brings up what the fanatics of static typing argue, and it shoots their claims down. Then, it talks about what the advocates of dynamic typing argue. Then, it mentions a few papers and seems to assume the position that removing the static type system from a language could not possibly improve it.

Then it mentions a paper that claims that the vast majority of digital data is not fully structured (this deserves its own thread, imho). And it seems to imply and accept (but not argue) that dynamic typing is more suitable to handle this kind of data.

I would have loved to read more on that. The last paragraph on page 1 just doesn't do it for me. I would have loved to see C/C++/etc. programs that were written decades ago, working on not fully structured data (was all data well structured then? What happened?), and being very inefficient. I would have loved to see examples of a dynamically typed program handling the same data like a champ.

Another thing that caught my eye was that even though the conclusion talks about a paradigm neutral language, a lot of the examples were for object oriented languages. Without going into a functional vs OO discussion, it should still be possible to demonstrate how the addition of a dynamic subset to C or Haskell improves its ability to handle the so-called not fully structured data.

Good Points

I was also surprised to see no discussion of XDuce, CDuce, or Xtatic, nothing about regular-expression types... basically, the more I think about it, the more the entire paper just feels like a strawman to me. All provocation, no substance.

Nitpick

That's why type system features like monads are like roach motels -- you can make expressions monadic, but you can't get out of the monad, because you don't want things like state or control effects spreading into the rest of the language.

Not that you don't already know this, but:

import Control.Monad.State

insideMonad :: State Int ()
insideMonad = do
   x <- get
   put $ x * 2

outsideMonad :: (Int) -> (Int)
outsideMonad = execState insideMonad
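For what it's worth, a quick usage sketch of the code above (nothing beyond the Control.Monad.State import already there):

-- Pure code calls outsideMonad like any other Int -> Int function; the
-- stateful computation runs and an ordinary Int comes back out.
example :: Int
example = outsideMonad 5   -- evaluates to 10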

That is, some monads - IO, for instance - are like roach motels, and others aren't. The state and continuation monads, for example, can pass values back into "pure" code. On the other hand, you can't get a continuation out of the continuation monad and start passing it around - so the effects encapsulated in those monads are still contained.

What's so bad about rule programming?

One wonders why logic programming is never mentioned in connection with type checking? When a logic program works it is an explicit proof of the program including both data types and behavior. Also backward chaining prevents execution from going beyond any error. What more could one ask?

Nothing!

Hank Thediek: One wonders why logic programming is never mentioned in connection with type checking? When a logic program works it is an explicit proof of the program including both data types and behavior. Also backward chaining prevents execution from going beyond any error. What more could one ask?

Well, it depends upon the logic in question: Prolog doesn't handle anything more expressive than Horn clauses. Oz does, but is dynamically typed (note that this doesn't stop me from intending to become fluent in Oz). I think a very interesting question is: what would a statically-typed logic language look like? The best answer so far seems to be Mercury, which I've never used. I did find this interesting comparison of Mercury's and Haskell98's type systems, however, that makes me wish I knew more.

Logic with types

I think that Lambda Prolog is a good example of what this might look like. Also worth looking at is Twelf, although I have to admit I have yet to fully grok it.

Frank Pfenning's notes on his

Frank Pfenning's notes on his Computation and Deduction class are really helpful. To read them you should be familiar with (say) the simply-typed lambda calculus and its soundness proof, and then you should ask (many) questions on the Twelf list to bring you up to speed. Twelf is way awesome, but understanding it takes quite a bit of background knowledge.

Not Horn clauses.

You're confusing datatypes with mathematical logic.

Rather, nested terms.

Which, IMO, are good enough as a base to emulate any other type you could wish for. (From an algorithmic point of view, at least.)

C# 3.0

How much does this paper tell us about C# 3.0? Here's one speculation...

FP compilers are advanced AI?

I liked this bit:

It [type inference] requires an advanced AI technique called unification

Prolog gave unification a bad

Prolog gave unification a bad name...

More like AI gave Prolog and Lisp a bad name

Gave them the rap of being pure theory and useful for only esoteric purposes. Calling something AI these days is tantamount to labelling it as a curiosity piece.

I really should have put a sm

I really should have put a smiley in my last message ;-)

*sigh*

Yeah, that unification algorithm that was only published in 1965 and is used by any CS undergrad who works through the wizard book, among others, or can be used by anyone working in O'Caml who downloads Wallace, which is now about five years old.

This is my problem: most people who write about types in programming languages who don't come from the modern functional language tradition are so ignorant that they unwittingly spread FUD like this to other programmers. The result is that ignorance breeds ignorance.
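And just to show how little magic is involved, here is a rough sketch of first-order syntactic unification in Haskell; this is my own illustration of the textbook algorithm (not Wallace, and not anything from the paper), with Term, Subst, and the function names invented for the example:

import qualified Data.Map as Map

data Term = Var String | App String [Term]
  deriving (Eq, Show)

type Subst = Map.Map String Term

-- Apply a substitution to a term, chasing bindings as needed.
apply :: Subst -> Term -> Term
apply s (Var v)    = maybe (Var v) (apply s) (Map.lookup v s)
apply s (App f ts) = App f (map (apply s) ts)

-- Compute a most general unifier of two terms, if one exists.
unify :: Term -> Term -> Maybe Subst
unify t1 t2 = go [(t1, t2)] Map.empty
  where
    go [] s = Just s
    go ((a, b) : rest) s =
      case (apply s a, apply s b) of
        (Var v, t) -> bind v t rest s
        (t, Var v) -> bind v t rest s
        (App f as, App g bs)
          | f == g && length as == length bs -> go (zip as bs ++ rest) s
          | otherwise                        -> Nothing
    bind v t rest s
      | t == Var v = go rest s
      | occurs v t = Nothing                  -- occurs check
      | otherwise  = go rest (Map.insert v t s)
    occurs v (Var w)    = v == w
    occurs v (App _ ts) = any (occurs v) ts

A call such as unify (App "f" [Var "x"]) (App "f" [App "zero" []]) simply yields the binding x := zero; nothing that warrants the label "advanced AI".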

Sigh...

Type inferencing in C# 2.0 for generic methods did use unification; it's detailed in the original Gyro document on generics. It may not make sense in 3.0, but if it involves some of the two-phase constraint logic programming that I hear about, I think that would qualify as AI.

As for whether unification is "advanced," I may have gone overboard, but I find that I get a lot of mileage out of it in my own AI projects.

Now, I'll go back to spreading FUD...

I really wouldn't call basic

I really wouldn't call basic unification AI, however useful for the purpose it is. It's not that far beyond evaluating basic logical propositions, something that again may be useful but isn't AI itself...

...and really, when the basic idea's so simple, why cover it with a label that makes people think "whoa, weird crazy stuff that never does what it's supposed to"?

Wikipedia proves that dynamic typing works

Dynamic typing works period. What accounts for the productivity difference between Java and Python? Basically dynamic typing. Static typing absolutely sucks for imperative languages.

On the other hand every functional static language I see has problems associated with it. Ocaml is way too complex. It has two syntaxes: one for imperative and one for functional. Also the type relationships are very complicated when you start examining the various possibilities like parameterized modules and parameterized classes. It is also true that you must understand the type system very well in order to use a language like Ocaml. Type inference saves typing but it does not reduce the complexity of functional languages. Also type inference does not eliminate the need to deal with the type system. The subtleties of parameterized modules and classes are very difficult to understand and can be extremely tricky. Parameterized modules are considered a feature of Ocaml but they are really a very complicated way of making a static type system more flexible. Essentially you get this flexibility for free in dynamic languages. In dynamic languages you don't have so called 'features' like parameterized modules because you don't have a type system.

In a language like ruby or smalltalk I never have to worry about type relationships since the only relevant question most of the time is does the object respond to the message you are sending to it. Prototyped languages in my view are the future because they have absolutely no type information at all since objects in these languages do not belong to any classes. I think the future belongs to languages like Io and Self. Programming languages should be as simple and powerful as possible like smalltalk, io and self. Is there any language which approaches their simplicity? Haskell? No because then I have to learn Monads to do simple things like input and output. Although I do admit that Haskell is superior to Ocaml which is way too complicated.

It took me 3 hours to understand Smalltalk and about a half an hour to understand Self/IO. I still don't understand O'Caml and I spent weeks reading the O'Reilly book. Haskell was pretty sweet and easy to understand but I was stumped at the Monads section. Now I admit I am stupid. So what. I think everything should be made as simple as possible and we generally tolerate way too much unnecessary complexity. I refuse to think unless I absolutely have to. Programming languages should be simple. Is there any functional/static language that is as simple as Io or Self? If there isn't then screw functional languages. Ok ya I forgot to admit that I am also extremely lazy.

"Wikipedia proves" considered harmful

Ah, the democratic approach to truth... ain't it grand? Maybe I'm the only one dopey enough to respond in depth to this, but...

Dynamic typing works period.

Wait, not democratic - I meant authoritarian.

On the other hand every functional static language I see has problems associated with it.

This statement is still true if you omit the words "functional static."

Ocaml is way too complex. It has two syntaxes: one for imperative and one for functional. Also the type relationships are very complicated when you start examining the various possibilities like parameterized modules and parameterized classes.

Those capabilities exist for good reasons - in general to explicitly express structural relationships which hold even if you "choose" to express them only implicitly (via lots o'code). Variables have types (degenerate case: their range of values) even in dynamically typed languages.

Also type inference does not eliminate the need to deal with the type system. The subtleties of parameterized modules and classes are very difficult to understand and can be extremely tricky.

And debugging a heap of code isn't? The type system should serve to express intentions, and enforce decisions - how they do it varies widely in power and style. You either do this explicitly or implicitly, and implicitly is risky.

So far all I get is that Ocaml's type system is hard to understand. Is there more to this?

Parameterized modules are considered a feature of Ocaml but they are really a very complicated way of making a static type system more flexible.

How would you do this more simply?

Essentially you get this flexibility for free in dynamic languages. In dynamic languages you don't have so called 'features' like parameterized modules because you don't have a type system.

You do, you're just choosing not to think too hard about it. In software development, not thinking is generally risky, despite buzz to the contrary.

In a language like ruby or smalltalk I never have to worry about type relationships since the only relevant question most of the time is does the object respond to the message you are sending to it.

And... that notion doesn't overlap with that of "type" to you?

Prototyped languages in my view are the future because they have absolutely no type information at all since objects in these languages do not belong to any classes.

They have capabilities which some other instances don't have. Thus sending them messages causes different things to happen. I'd rather find out I've made a mistake at build time.

It took me 3 hours to understand Smalltalk and about a half an hour to understand Self/IO. I still don't understand O'Caml and I spent weeks reading the O'Reilly book. Haskell was pretty sweet and easy to understand but I was stumped at the Monads section. Now I admit I am stupid. So what.

I think everything should be made as simple as possible

And presumably no simpler?

and we generally tolerate way too much unnecessary complexity.

Couldn't agree more. If wading through reams of code with no type information is your idea of simplicity, have at it!

I refuse to think unless I absolutely have to.

I guess it depends on your type?

Programming languages should be simple.

As should everything. To parrot you: So what. :-)

I fail to see

How your ignorance is an argument against static typing.

Baruch Spinoza

Plagiarism! ;-) The above is actually very close to a nice famous quote by Spinoza:

Ignorantia non est argumentum (Ignorance is no argument)

This was said as a counter to the claim that all men's actions are predestined by God (since there is no other good explanation conceivable).

Whoops forgot to say why wikipedia proves dynamic typing works

Oops in the previous comment I forgot to explain my title. Essentially I was trying to allude to the fact that the wikipedia is a lot like dynamic typing. In the wikipedia there is no system to make sure that the people who write the articles actually have the credentials to write them. Yet the wikipedia works. The wikipedia does checking after the article is written, not before. In other words it is like dynamic typing.

On the other hand conventional encyclopedias do static checking since they make sure that the person who is writing the article is qualified to do so before the article is written.

Of course it is obvious that the wikipedia has blown away conventional encyclopedias in terms of productivity.

The question is when do you do the checking, before or after. If you do it before, then you will always prevent certain articles from getting written that would have been good ones.

With static typing you wouldn

With static typing you wouldn't have made that mistake! ;-)

Proof by popularity!

Essentially I was trying to allude to the fact that the wikipedia is a lot like dynamic typing. In the wikipedia there is no system to make sure that the people who write the articles actually have the credentials to write them. Yet the wikipedia works.

(sound of crickets chirping in the pale moonlight)

The wikipedia does checking after the article is written, not before. In other words it is like dynamic typing.

"The Wikipedia" does nothing. People hand-check that. Were you a fan of, for example, punchcards?

On the other hand conventional encyclopedias do static checking since they make sure that the person who is writing the article is qualified to do so before the article is written.

Of course it is obvious that the wikipedia has blown away conventional encyclopedias in terms of productivity.

Productivity measured in terms of numbers of words? I can write a program that would be much more productive than Wikipedia (and in a functional language, too).

So the benefit of late checking rather than early is productivity? Why check at all? Why not extend indefinitely that window of pleasantly uninformed hopes and dreams?

The question is when do you do the checking, before or after. If you do it before, then you will always prevent certain articles from getting written that would have been good ones.

The benefit of having a computer check is that it can do it quickly - and in that case you're better off doing the checking early, so that you can rely on information you extract from the "database."

Clearly this analogy has been stretched far beyond its endurance...

wikipedia is a bad example

Many wikipedia articles have errors, or are misleading or muddled, for example. There is no guarantee that articles converge on correctness, truth, or fairness, and the greater persistence of fanatics may prevent that in certain areas.

When I saw the bit about Wikipedia...

...and how it "proves" that dynamic typing "works"; I though that it was an argument along the lines of "See? Wikipedia (more specifically the MediaWiki software on which Wikipedia runs) is a robust, large-scale system implemented using PHP, a dynamically typed language; therefore it constitutes evidence of the suitability of dynamic typing".

Apparently, I gave the author of the above missive far too much credit; he seems to somehow draw an analogy equating wikis with dynamic typing (and peer-reviewed, edited publications such as traditional encyclopedias with static typing). Which gave me a good laugh; it's even funnier (and more fractured) than topmind's favorite analogy of pointers : GOTO :: relational : structured programming.

Given that I think the static/dynamic typing war is both silly and unfortunate (and that therefore I'm morally obliged to defend dynamic typing in forums such as LTU where static typing advocates prevail; likewise I defend static typing in places such as c2), I will add the comment that dynamic typing has proven itself useful; many robust and productive systems have been deployed using it.

Whether it is ideal or not is another matter; one I believe is highly dependent on context. One context where dynamic typing does remain highly useful is applications (such as Wikipedia) where one has to interact with lots of objects (of unknown type) from external agents (often times users, some of them hostile), and/or with other systems over type-lossy interfaces (such as SQL).

But anyway, I like static typing just as much. :) Pity too many languages (by virtue of cleanly supporting only one or the other) tend to make us choose one way or the other up front.

WRT the quality of the articles on Wikipedia: Some of 'em are good, some of 'em are bad; and controversial topics do get bogged down in edit wars as different factions want their side presented as The Trvth. However--if you see an article which a) discusses a topic you are an expert on; and b) is manifestly incorrect, don't just sit there and complain. It's a wiki! Fix the damn article! Click on the "edit this page" tab on the top of the article, and correct the errors! You don't need an account (though if you plan to become a regular editor it's easy to get one), and nobody will ban you or make your life miserable if you make a mistake (all prior revisions are kept of everything, so it's impossible for non-administrators to permanently destroy anything). Wikipedia DEPENDS on contributions from experts to be a comprehensive encyclopedia. It's rather fun, actually.

Enough ranting for today.

fixing wikipedia

I did put a fix in for the article I cited, even though a) I'm not an expert on the topic, b) there are still problems with the page which I lack the energy or desire to try and repair.

To be serious.

Static typing will always be more efficient from an algorithmic point of view, so defending dynamic typing is a sort of losing proposition.

Though from a practical point of view, I guess there isn't any noticeable difference.

Type System Demystified

There is this article that I wrote for the iProgrammer website that
explores the differences between static and dynamic type systems and explains why it is an important issue that a programmer must master.
Basically it started out when I was trying to make the transition from a dynamic language background (Perl) to the .NET platform and C#. I think it could assist someone with a dynamic language background such as Perl who wants to know how dynamic programming works on the .NET platform, a C# programmer interested in dynamic programming, or any programmer who wants to know how type systems work. Link: Type System Demystified

Common misconception

In simple words static typing means tying a variable with a type at compile time

I understand the need to simplify things, but that's simply not true. Static typing doesn't require variables (see "point free" programming) and a language can allow the attachment of type-like labels to variables at compile time but be dynamically typed (see Groovy).
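As a tiny sketch of that point in Haskell (example mine): the definition below mentions no variables at all, so there is literally nothing to "tie a type to", yet the whole pipeline is statically checked:

-- A point-free definition: no variables appear, but the composition is
-- fully typechecked at compile time.
sumOfDoubles :: Num a => [a] -> a
sumOfDoubles = sum . map (* 2)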

I talk about this misconception here:
Groovy Does Not Have Optional Static Typing
.

The most useful compromise...

Even in a dynamically-typed language, you can eliminate most runtime typechecks at the cost of code bloat, by means of fairly straightforward, if time-consuming, inductive analysis. Your language and libraries provide well-behaved functions (in this context meaning functions whose return types are simple pure functions of the argument types) and ill-behaved functions, which in this context means all other functions. Induction can be applied to these sets to determine, in most cases, whether a user-defined function is well-behaved or ill-behaved.

In the presence of fundamentally ill-behaved language and library procedures like Lisp's read and eval, where the return type is NOT a pure function of the argument types, you cannot eliminate all typechecks.

But you can eliminate virtually all typechecks except those corresponding to calls to ill-behaved primitive procedures. And using return specialization, where the function returns to a different point depending on the type of its return value, you can move those typechecks into the ill-behaved procedures themselves.

What you cannot eliminate is typechecks after calls to procedures whose identity (or type) you just don't know at compile time. Fortunately these calls are fairly rare, but in any dynamic language, they exist. If you can't completely analyze the program's behavior, which is tantamount to running it at compile time, you just have to assume that such calls are calls to ill-behaved procedures.

Incompatible with separate compilation

What you cannot eliminate is typechecks after calls to procedures whose identity (or type) you just don't know at compile time. Fortunately these calls are fairly rare

Um. With any sane notion of modularity and separate compilation, calls to procedures not known at compile time are the norm.

That's true.

That's true. Maximizing type inference, like maximizing flow analysis, is a whole-program optimization.

Hmm

Here's a function:


(define (f x)
  (+ 1 (car (cdr x))))

This code calls no unknown procedures. How do we eliminate runtime typechecks?

Start by computing what type

Start by computing what type the function argument has to be. Since (+ 1 (car (cdr x))) will result in an error unless its second argument is a number, the expression (car (cdr x)) must have a numeric type.

Since (car (cdr x)) will result in an error unless (cdr x) is a pair,
the expression (cdr x) must return a pair.

Since (cdr x) will also result in an error unless x is a pair, the expression (in this case a variable reference) x must return a pair.

Put these facts together, and you know the type of f: it has type pairs-whose-cdrs-are-pairs-whose-cars-are-numbers-to-numbers. Internally, you generate a gnarly-type-expression which says all of that in whatever type calculus you're using, remember it as the type of f, and check calls to f against it.
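For comparison, here is a rough Haskell analogue (mine, using nested pairs in place of cons cells) showing that this is precisely the type that ML-style inference assigns without being told anything:

-- The analogue of (lambda (x) (+ 1 (car (cdr x)))) over nested pairs.
-- With no annotation, the inferred type is f :: Num a => (b, (a, c)) -> a,
-- i.e. pairs whose cdr is a pair whose car is a number, mapped to numbers.
f x = 1 + fst (snd x)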

Now you output at least one machine-code vector for f: fa which does no typechecking. Depending on whether the user is asking for strict compile-time typechecking or not, you may also emit fb which does runtime typechecking, but your objective is to minimize calls to fb.

Later on, if you see a call like (f 5) or something, you signal a type error at compile time because you can prove that the expression 5 is not a member of type gnarly-type-expression. In the more usual case of a call with a variable name like (f x) you will have to do flow analysis to find the type, at the call site, of the value in the variable x.

If you see a call where you can prove that the expression supplied as an argument to f is a member of the type gnarly-type-expression, you call fa.

And, if you see a call to f where you cannot figure out whether the argument satisfies gnarly-type-expression, you either compile a call to fb for runtime checks, or signal an error, depending on how strictly the user asked you to typecheck his code.

It's fairly straightforward Hindley-Milner type inference, and it can be applied to latently-typed languages. The main difference, in fact, is not figuring out the types of procedures and code subexpressions. The main difference is that you have to transform the code into SSA form before you have well-typed variables for the Hindley-Milner algorithm to work with. It's a degenerate case, but a static-single-assignment variable has a definite type in Hindley-Milner, because whatever value is stored in it must have a definite type, and there is only one value ever stored in it.

In the presence of procedures which are not well-behaved with respect to type, like read and eval, you will not be able to completely eliminate runtime typechecks from all programs, but you can come pretty close.

For a worked example of a Lisp system that does this, look at a Scheme implementation called Stalin, written by Jeffrey Mark Siskind.

Old article

This is a fairly old article. There's been a lot of work since then on combining static and dynamic typing, much of it based on Wadler's paper: 'well-typed programs can't be blamed'.

Recent work

I don't think this statement is correct, and it might make a few people unhappy. Most of the more recent work on gradual/hybrid/etc typing has been independent of, and prior to, the Wadler & Findler paper.

Recent Work

Your summary of the recent work is incorrect, except to say that there's been lots of work in this area. The beginning of recent interest in a non-speculative sense goes back to Siek and Taha, Scheme Workshop 2006, and Tobin-Hochstadt and Felleisen, DLS 2006.