Buried Treasure - No Fluff Just Stuff

The trend toward dynamically typed languages is both widespread and strong. Less obvious, though, is a resurgence of interest in functional programming and functional languages. ... Haskell ... Objective CAML ... Scala ...

Is it double-think that allows the promotion of both dynamically typed languages and, by name, Haskell, OCaml, and Scala, without any acknowledgement that those powerful and efficient functional languages are ... statically type checked?

Not so sure about that

...the adoption of dynamically typed languages is widespread and strong

The whole paper smells of RoR hype more than anything else; seeming to suggest that RoR is driving this "strong and widespread adoption". Python, Smalltalk, Lisp, and other dynamically typed languages weren't on the verge of death before Ruby.

The trend I see is the realization that boilerplate sucks and reading redundant code sucks. Some people think of statically typed languages, and the first languages that pop into their heads are C++, Java, C#... They like the loosey-goosey feel of duck typing and not having to declare types.

But as we know, type-inference, structural subtyping, and having a toplevel to play with can mitigate some of these weaknesses of statically typed languages.
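
For example, in a language that has both features (sketched here in TypeScript; the `Named`/`greet` names are invented for illustration), you get much of the loose feel without giving up static checking: no type annotation on the value, and no nominal "implements" declaration to satisfy the function's parameter type.

```typescript
// Structural subtyping: any object with a `name: string` field qualifies;
// no nominal interface implementation needs to be declared.
interface Named { name: string }

function greet(x: Named): string {
  return "hello, " + x.name;
}

// Type inference: no annotation on `point`, yet its type is known statically.
const point = { name: "origin", x: 0, y: 0 };

// Accepted structurally, because `point` has a `name: string` field.
const msg = greet(point);
```

Misspelling `name` in `point`, or passing an object without it, would be rejected at compile time, which is the mitigation being claimed.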

I've really been enjoying playing around with Nemerle lately. It seems a nice blend of OO and functional programming, with type inference, macros, variants/pattern-matching, open classes, optional python-like indentation syntax, along with the ability to use the .NET libraries. In fact, C# 3.0 seems to be going in this direction.

I guess Scala would be something similar for the Java platform, but I don't know much about it.

not quite

But as we know, type-inference, structural subtyping, and having a toplevel to play with can mitigate some of these weaknesses of statically typed languages.

Some of the weaknesses may be taken care of, but that certainly doesn't mean such a language would be as flexible as a dynamically typed one. I've found that such principles can be applied from the other direction and yield a much more favorable result. Letting the programmer do as he wishes, performing error checking at compile time when sensible, and using an editor which actually tells you information about what's going on (e.g. types) pretty well negate most, if not all, of dynamic typing's disadvantages.

*sigh*

Apart from the irreducible fact that static typing proves the absence of classes of misbehavior before delivering the software, whereas dynamic typing can prove no such thing, leaving everything to testing, which, per Edsger Dijkstra, can only prove the presence of defects, not their absence. This distinction is only becoming more indicative of static typing's value over time, rather than less, as we see with the work on type system support for concurrency, distribution, and security: Software Transactional Memory, TyPiCal, HashCaml, Acute, Flow Caml... all of these represent statically-typed solutions to problems such as type-safe marshalling, proving the absence of deadlock and race conditions, and proving code adherence to security policies, all before the code gets to the customer, and all of which are quite literally impossible to provide a dynamically-typed analog for. Proving, as in, the code doesn't compile if it's wrong. Proving, as in: there's no need to test those features (of course, you still need to test features that can only be realized by the dynamic semantics of the language in question).

At the moment, for most software, the distinction probably isn't that important, and I've acknowledged this before. But let's be clear: static typing is becoming more important, not less, as software is becoming more concurrent, more distributed, and (it is devoutly to be hoped) more secure, not less. If the future looks grim for anything, it's dynamic typing, which is really best suited for exploratory programming in poorly-understood domains with small teams of genius-level programmers (per Marvin Minsky). "Letting the programmer do as he wishes" is a great policy for a single person working on a pet project, or a small team of genius programmers. I can tell you from about a quarter century of industry experience, it's an unmitigated disaster for the industry.

purity

Apart from the irreducible fact that static typing proves the absence of classes of misbehavior before delivering the software, whereas dynamic typing can prove no such thing, leaving everything to testing, which, per Edsger Dijkstra, can only prove the presence of defects, not their absence.

For pure dynamic typing this may be true. What I'm proposing is somewhat of a hybrid--those certain classes of errors you speak of are, for the most part, handled at compile time. The main difference is, of course, that explicit casting is not required, which is easily made up for with a decent editor which highlights possibly erroneous code. If you consider that, it comes down to one thing: who does the work. With static typing, explicit casts help notify yourself of the possibility of a runtime error; with dynamic typing, the editor does this for you.

proving the absence of deadlock and race conditions, and proving code adherence to security policies, all before the code gets to the customer, and all of which are quite literally impossible to provide a dynamically-typed analog for.

I suppose static typing allows for solving the halting problem, as well? ;-) I would like to see examples of these; I doubt they're something that couldn't be done with a hybrid system.

"Letting the programmer do as he wishes" is a great policy for a single person working on a pet project, or a small team of genius programmers. I can tell you from about a quarter century of industry experience, it's an unmitigated disaster for the industry.

The only circumstance I can imagine where dynamic typing with a decent development environment is that bad is when the programmers on said project aren't too knowledgeable in their profession, in which case you're already screwed if you try to get them to do anything moderately hard, regardless of the type system.

This Always Gets Confusing

Curtis W: If you consider that, it comes down to one thing: who does the work. With static typing, explicit casts help notify yourself of the possibility of a runtime error; with dynamic typing, the editor does this for you.

If the editor can infer the type and let me know of it, why shouldn't/wouldn't the compiler do the same thing? This is one thing I've really never understood about the dynamic typing argument: it frequently seems to come down to "let's do type inference somewhere other than the compiler." The only putative value I can see to that is if what's implicitly meant is "and let's not report an error on anything that can't be inferred," which seems to me to be worse than useless: the casual programmer is lulled into a false sense of security because the system doesn't tell them they might be talking nonsense, vs. the static typing case, which is admittedly conservative because it sometimes rejects valid programs. I think this may inform a later comment of yours, so more in a moment.

Curtis W: I suppose static typing allows for solving the halting problem, as well? ;-) I would like to see examples of these; I doubt they're something that couldn't be done with a hybrid system.

Boy, you must be new here. :-)

Acute deals with type-safe marshaling, dynamic type-and-version-safe upgrade, and several other features.

HashCaml follows up on Acute, specifically in type-safe marshaling, dealing with several constructs that Acute did not.

TyPiCal is the Typed Pi Calculus system, statically preventing deadlock and race conditions and also providing information flow analysis.

Flow Caml is all about information flow analysis, adhering to a security policy, etc.

Sure, you could implement these in a hybrid system of some kind—it's just that they'd all fall on the static side of such a hybrid, with the notable exception being that there's work to be done in the information flow realm to allow policy statements to be values as well as types.

So ultimately, yeah, we all want some kind of integration of static and dynamic typing so that we can choose our acceptable risk level in our code. Partial evaluation, dependent type systems, Ontic-like semantics... I honestly don't know which of these, or whether some combination, will yield the most value, nor do I have a really good handle on how to do any of these things (yet—I suspect that finishing working through TAPL and ATTAPL will help a lot). But I do know that the big challenges in programming today revolve around concurrency, distribution, and security, and am confident that those challenges are only completely addressable with a very rich type system, probably involving dependent types or Ontic-style semantics. I also think, though, that such a system needs to be inferred and needs to be incremental, so that it can be embedded in an IDE and stay out of the way as much as humanly possible, so that programmers can reap the benefits of interactive, incremental development and not confuse "static typing" with "declare everything, batch everything" development.

That's what I think the original comment from the article was getting at, to try to bring all of this home: it's perfectly possible to have interactive, incremental development with a statically-typed language, as I do in all of my O'Caml work. Lots of folks think that's the exclusive province of the "scripting" language or dynamically-typed languages like Lisp/Scheme, but it isn't. I think if that word gets out more, we'll see more people moving away from dynamically-typed languages to friendly statically-typed languages, rather than vice-versa. Then, hopefully, something like Tim Sweeney's ideal language will appear, and much of this debate will really become moot!

Curtis W: The only circumstance I can imagine where dynamic typing with a decent development environment is that bad is when the programmers on said project aren't too knowledgeable in their profession, in which case you're already screwed if you try to get them to do anything moderately hard, regardless of the type system.

Like it or not, most programming teams in the world aren't A-list, so yes, you really do have to think about basic competency issues. "It's not that bad in the real world" is the constant canard argument that dynamic typing fans make, and it's simply false. It's based on the inaccurate premises that all programmers are good programmers and that there's no basic difference, really, between a tool that proves the absence of undesirable properties before delivery and one that doesn't, especially when those properties are among the most complex ones we deal with in software.

Even if you assume every team is an A-list team, very bright, very capable, very experienced programmers screw up shared-state concurrency all the time—it's the norm, not the exception. The situation with distribution and, especially, security is even worse. Having a compiler that rejects code that will deadlock or has race conditions, or that emits code that's at least guaranteed to throw an exception if I attempt to do type-incompatible marshaling/unmarshaling, or rejects code that violates a security policy, means that I can concentrate more of my creative energies on solving the actual problem at hand, instead of worrying about things that the computer can worry about for me.

That's unbelievably valuable, and again, it's pretty obviously where things are going over the long haul—yes, the features that right now are only found in exotic-seeming languages need to be mainstreamed, careful attention has to be paid to syntax and tool support and all of the other i's and t's that go into making a real industrial language—but make no mistake, this is how the tough programming issues will be solved. After all, all I'm saying, in the end, is something like "all sufficiently complex programming issues are ultimately resolved through metalinguistic abstraction," and once a few meta-solutions are proven over time, they stop being meta and simply become new languages.

Incremental/Interactive development

also think, though, that such a system needs to be inferred and needs to be incremental, so that it can be embedded in an IDE and stay out of the way as much as humanly possible, so that programmers can reap the benefits of interactive, incremental development and not confuse "static typing" with "declare everything, batch everything" development.

That's what I think the original comment from the article was getting at, to try to bring all of this home: it's perfectly possible to have interactive, incremental development with a statically-typed language, as I do in all of my O'Caml work

I know you're a big OCaml fan, so I was wondering if you had tried out F#? The F# addin for Visual Studio is very sweet. You get all the intellisense stuff, plus you can just write down any snippet of code you want in the editor, hit ALT-; for a one liner and have it compiled and evaluated, or highlight a block of code and hit ALT-Enter and have it compiled and evaluated.

Or you can just go down to the REPL window and type out some code and ;; to get it evaluated.

Wrong Platform :-)

If I were a Windows guy, I'd certainly take a look at F#. :-) I do appreciate the functionality you're describing—it's just that I appreciate it with O'Caml and Tuareg Mode in GNU EMACS.

If the editor can infer the

If the editor can infer the type and let me know of it, why shouldn't/wouldn't the compiler do the same thing?

Because the editor can notify you better. Which would you prefer, highlighted warnings with popup explanations before you compile, or a terse text log which swamps you with all the warnings at once?

The only putative value I can see to that is if what's implicitly meant is "and let's not report an error on anything that can't be inferred," which seems to me to be worse than useless: the casual programmer is lulled into a false sense of security because the system doesn't tell them they might be talking nonsense, vs. the static typing case, which is admittedly conservative because it sometimes rejects valid programs.

What, you've never heard of warnings before? It's not as though the editor is going to go through all that work to throw the information it got away. In any event, consider the differences that are left when comparing a hybrid dynamically typed system with a structurally subtyped, type inferred static system. The only difference is in the handling of possibly erroneous situations. In dynamic typing, the editor alerts you to this fact, whereas in a static system you have to notify yourself with a cast. Furthermore, this doesn't even guarantee correctness; if, at run time, the given object isn't compatible, you've got an error on your hands that might've gone unchecked.

More False Dichotomies to Slay :-)

Curtis W: Because the editor can notify you better. Which would you prefer, highlighted warnings with popup explanations before you compile, or a terse text log which swamps you with all the warnings at once?

This is exactly why I urge the development of incremental compilers: it's not a batch world anymore, and hasn't been for about a decade. Modern machines spend the overwhelming majority of their time waiting for their users—time that could be productively spent dealing with a developer's actions in real-time. There are incremental parsing algorithms and incremental type-inference algorithms for some type systems (typically those with "principal typings" vs. "principal types," surely a confusing state of affairs). So what's really needed isn't even strictly in the realm of language design, but rather in better compiler stage separation and the provision of an event-driven interface. The point, of course, is that this has nothing whatsoever to do with being either statically- or dynamically-typed. I almost get this with O'Caml and Tuareg Mode, but not quite: the "show type at point" feature of Tuareg Mode only works if the code in question has been compiled with the -dtypes option, i.e. it's still a batch process. That's something the O'Caml team could address, however, if they chose to.

Curtis W: What, you've never heard of warnings before? It's not as though the editor is going to go through all that work to throw the information it got away.

Nor would the compiler, but remember, I'm talking about that which can't be inferred in even the most powerful known type systems. After all, that's the real advantage of dynamic typing; it allows a programmer to express things that intuition says are safe (or at least the risk of runtime failure is acceptable) that all known static type systems would reject.

Curtis W: In any event, consider the differences that are left when comparing a hybrid dynamically typed system with a structurally subtyped, type inferred static system. The only difference is in the handling of possibly erroneous situations. In dynamic typing, the editor alerts you to this fact...

OK, I'll bite: how? Is the editor going to do type inference (let's keep it simple and say "in the Milner-Damas system like the ML family") and then tell you about each and every term that it can't type that way? Yikes. I have to be honest: I use such a type-inferring-but-warning compiler. It's called Steel Bank Common Lisp, and I pay almost no attention to its warnings because they are so voluminous and, um, underinformative. I worry that this would be the case for any such language, whose expressive power far outstrips the power of its type system.

Curtis W: ...whereas in a static system you have to notify yourself with a cast.

Except in actual statically typed languages where unsafe (that is, down) casts don't exist. :-)

Curtis W: Furthermore, this doesn't even guarantee correctness; if, at run time, the given object isn't compatible, you've got an error on your hands that might've gone unchecked.

Correct—that's exactly why languages such as O'Caml don't have such casts.

I'd be interested to know

I'd be interested to know what "intuitively safe" code you have that isn't typeable under, say, the Calculus of Inductive Constructions. Granted it's not particularly pragmatic as a programming language...

Of Course...

...I don't have any: I don't trust my intuition as much as I trust even Milner-Damas, which is a big reason why I now do all my recreational programming in O'Caml. :-) But the argument for dynamic typing I've heard repeatedly is that the programmer knows what he/she is doing and the type system is just "getting in the way." I've not seen any non-contrived examples—the one that generally is offered is heterogeneous lists, which first of all is a mechanism, not a use-case in a domain; and secondly is pretty easy to implement in Standard ML, O'Caml, or Haskell.
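
For what it's worth, the standard statically typed encoding of a heterogeneous list is a variant type; a minimal sketch, here in TypeScript as a discriminated union (the `Item` type and `render` function are invented for illustration):

```typescript
// A "heterogeneous" list as a list of a discriminated-union type:
// each element carries a tag, so every use site can be checked statically.
type Item =
  | { kind: "int"; value: number }
  | { kind: "str"; value: string };

const xs: Item[] = [
  { kind: "int", value: 42 },
  { kind: "str", value: "hello" },
];

// Pattern match on the tag; omitting a case is a compile-time error
// because the function would no longer return a string on all paths.
function render(i: Item): string {
  switch (i.kind) {
    case "int": return i.value.toString();
    case "str": return i.value;
  }
}

const rendered = xs.map(render);
```

The same shape falls out of an ordinary `datatype`/`type`/`data` declaration in Standard ML, O'Caml, or Haskell, which is why the example feels contrived as an argument for dynamic typing.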

I can find plenty of

I can find plenty of examples that I can't type in a language that's production-ready, although most of the really fun ones . I picked the CIC for a reason though - if I can't type it given pretty much the entirety of constructive mathematics to work with, I'm not sure I should be trusting my intuition (if I have a firmer reason for it, that's different - systems that aren't supposed to terminate, for example).

A dynamic typing thingy

Here's a non-trivial use of dynamic typing from my work.

The platform is HTML+JavaScript. The application is a touchscreen interface for a kiosk. The interface naturally splits into things called "features" that are organized into "pages". Example feature: "screensaver kicks in after n minutes of inactivity". Another example feature: "the back button leads somewhere". Another example feature: "a list of songs is displayed on the screen". Different pages are just different combinations of features.

The common interface of all features consists of 2 methods: show and hide. Hide has no arguments. Show has one argument, the type of which varies from feature to feature. Example: the back button's argument is the name of the page the back button leads to. The screensaver's argument is the inactivity time. And so on.

All features and all pages have names. The data structure that describes the whole interface can be schematically described as: map<pageName, map<featureName, argument>>.

There might be a way to statically type this, but the DT solution works just fine :-)
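
A minimal sketch of that structure, with `any` standing in for the dynamically typed argument (TypeScript; all the page and feature names are illustrative, not from the actual kiosk):

```typescript
// map<pageName, map<featureName, argument>>: each feature interprets
// its own argument, so the argument slot is dynamically typed.
type Page = { [featureName: string]: any };
type Ui = { [pageName: string]: Page };

const ui: Ui = {
  songList: {
    backButton: "mainMenu", // argument: name of the page the back button leads to
    screensaver: 5,         // argument: inactivity time in minutes
  },
};

// Each feature's show() receives whatever the table holds for it;
// a wrong argument type surfaces only at runtime.
const features: { [name: string]: { show(arg: any): string; hide(): void } } = {
  backButton: { show: (target) => `back -> ${target}`, hide: () => {} },
  screensaver: { show: (mins) => `screensaver after ${mins}min`, hide: () => {} },
};

function showPage(page: string): string[] {
  const p = ui[page];
  return Object.keys(p).map((f) => features[f].show(p[f]));
}
```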

Closures to the rescue

Vladimir Slepnev: map<pageName, map<featureName, argument>>

Of course, the obvious way to circumvent this (and actually the more natural solution IMHO) is to use closures and just have a map<pageName, unit->T> (where T is whatever return type the show function has). Each closure just applies the show function from the respective feature to its argument.

DRY

This violates the "don't repeat yourself" rule :-)

What's repeated?

How is that?

Aah

Sorry, I was a bit careless.

You must've meant map<pageName, set<unit->T>>. This doesn't violate DRY, but it also doesn't contain all the required information. We don't know which "hide" functions to call - only "show". And if we fix it in the obvious way, we'll start to violate DRY :-)

Mh...

You are right, I missed that it should be a set. Sorry for that.

But I still don't believe there is a problem. You need hide as well. Fine: then store a set of pairs of closures instead - in other words, a set of objects, each having a show and a hide method (both without arguments!). The show argument becomes a constructor argument for the respective feature classes (or functions creating features). This should be no more verbose to express.

One advantage of this solution is that it scales to more heterogeneous cases, where not all features by accident happen to have exactly one argument for show. ;-)
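
A sketch of that design (TypeScript, with illustrative names): the show argument moves into the feature's constructor, leaving both methods zero-argument, so the page table needs no per-feature argument type at all.

```typescript
// A bound feature: show and hide take no arguments, because the
// per-feature argument was supplied at construction time.
interface BoundFeature {
  show(): string;
  hide(): void;
}

// Factory functions play the role of feature constructors; each closes
// over its own argument, of its own type.
const backButton = (target: string): BoundFeature => ({
  show: () => `back -> ${target}`,
  hide: () => {},
});

const screensaver = (minutes: number): BoundFeature => ({
  show: () => `screensaver after ${minutes}min`,
  hide: () => {},
});

// map<pageName, set<BoundFeature>> — fully statically typed.
const pages: { [pageName: string]: BoundFeature[] } = {
  songList: [backButton("mainMenu"), screensaver(5)],
};

const shown = pages["songList"].map((f) => f.show());
```

Note that a feature needing two arguments just takes a two-argument constructor; nothing about the page table changes.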

OK

Aah, I see.

bind :: Feature<T> → T → BoundFeature

map<pageName, set<BoundFeature>>

Is this what you meant? Good! (Sorry for the crazy syntax; I'm just trying to be understandable.)

As for scaling, I have no problem supplying a 2-element array as an argument to show. Dynamic typing, you know :-)

Almost what I meant

...except that I wouldn't even bother with the bind function, and instead always construct appropriate BoundFeatures right away. What would you ever need a Feature<T> value for?

To pass different arguments

You might say that a Feature<T> is a constructor :-)

Another observation

I sense an interesting trend here. Your statically-typed solution is composed from values with matching types. My dynamically-typed solution, OTOH, contains names of stuff, not stuff itself.

I like names. It seems to me that dynamically-typed systems encourage giving names to stuff. This may be one of the core reasons I like DT :-)

The Web likes names, too; and it's dynamically typed - a URL may point to anything.

Naming stuff

Unless I'm missing something obvious, the DT solution doesn't name the type, just the containers. ST can name both the type and the containers. So if you like naming....

Hm

That's a different meaning of "naming" :-) My "naming" maps Strings to Things. Let me explain.

To my subjective eye, types in ST languages don't even have names. Let's say we want to read a class name from a config file and default-construct an object of this class (say, a user plugin to our software). It's easy in Java: we have Class.forName(String). Even easier in Ruby. Note the user didn't have to "register" stuff somewhere, or otherwise worry about naming; it already exists.
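
For contrast, the usual analogue in a language without runtime class lookup is an explicit registry; a minimal TypeScript sketch (`Plugin`, `register`, and `forName` are invented names, not any real API):

```typescript
// Name-based construction via an explicit registry: a map from string
// names to zero-argument constructors. This is exactly the "registering
// stuff somewhere" step that Class.forName spares you in Java.
interface Plugin { run(): string }

const registry: { [className: string]: (() => Plugin) | undefined } = {};

function register(name: string, ctor: () => Plugin): void {
  registry[name] = ctor;
}

// The moral equivalent of Class.forName(name).newInstance():
function forName(name: string): Plugin {
  const ctor = registry[name];
  if (ctor === undefined) throw new Error(`unknown plugin: ${name}`);
  return ctor();
}

register("HelloPlugin", () => ({ run: () => "hello from plugin" }));
const p = forName("HelloPlugin");
```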

Another fun topic is RMI naming (java.rmi.Naming). I can bind any object to any name, then call it from another machine. Remote calls are one of those cases where you absolutely need dynamic typing. Remote calls can't be verified statically, period. (Compile the client with an old version of the server interface, then use it with the new version. Crash.)

I like the freedom to name and look up stuff that DT languages give me. YMMV.

Well, Alice has its packages

that provide for remote components and calls, so I'm sure Andreas can clue us in about how to accomplish this within the framework of static languages (see Andreas's paper The Missing Link - Dynamic Components for ML).

Runtime type information

It's easy in Java: we have Class.forName(String).
...
I like the freedom to name and look up stuff that DT languages give me. YMMV.

Despite the holes in its type system, Java is ST by any normal definition, so the use of Java as an example of a language in which this is easy illustrates that this is not an ST issue. C++ also supports access to runtime type information (RTTI).

The issue is that in a language that does significant static processing, there are certain kinds of names that are only needed by the language implementation itself at compile time. Type names in particular are a common example of this. The choice to keep information about types at runtime is then completely optional, and some languages choose to keep it around, whereas others don't.

ST languages often drop that information by default, on the principle that programs should only pay for the features they use. But there's nothing intrinsic about ST languages that prevents having runtime type information, and there's even nothing intrinsic about DT languages that requires having that information (other than the type tags necessary for runtime typechecking). The same goes for having runtime access to other kinds of names, such as the names of lexical variables.

Remote calls

Remote calls can't be verified statically, period. (Compile the client with an old version of the server interface, then use it with the new version. Crash.)

Depends. You can dynamically check types - once! - before you perform the first call, i.e. when you establish the connection. Then the calls themselves can be statically type-safe. Since Chris mentioned Alice ML: this is what it does with its proxy mechanism.

So yes, you need a small amount of dynamic type checking for open distributed programming, but it's much smaller than you think - and it does not preclude having a statically typed language.
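
The check-once idea can be sketched as follows (TypeScript; the fingerprint scheme is invented for illustration and is not Alice ML's actual proxy protocol): one dynamic check when the connection is established, after which every call goes through a statically typed proxy.

```typescript
// The service interface the client was compiled against.
interface SongService { listSongs(): string[] }

// What the server advertises at connect time: an interface fingerprint
// plus the implementation (in reality this crosses the wire).
interface Endpoint { fingerprint: string; impl: unknown }

const CLIENT_FINGERPRINT = "SongService/v2";

// A single dynamic type check at connection time; the returned proxy is
// statically typed, so subsequent calls need no per-call checking.
function connect(ep: Endpoint): SongService {
  if (ep.fingerprint !== CLIENT_FINGERPRINT) {
    throw new Error(`interface mismatch: ${ep.fingerprint}`);
  }
  return ep.impl as SongService; // justified by the fingerprint check above
}

const server: Endpoint = {
  fingerprint: "SongService/v2",
  impl: { listSongs: () => ["song A", "song B"] },
};

const svc = connect(server);   // checked once, here
const songs = svc.listSongs(); // statically type-safe from here on
```

An old client meeting a new server fails cleanly at `connect`, rather than crashing mid-call.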

Regarding names: there are symbolic names and there are logical names (ids). I suppose you mean the former. Problem with them is that they always include the potential for name clashes.

Naturally...

...I was going to point out that type-safe distribution is exactly what Acute and HashCaml do, so "Remote calls can't be verified statically, period" is rather obviously too strong!

Well...

AFAIK, Acute's inter-process communication consists only of IO.send/receive, and those actually perform a dynamic type check for every single transmission. Similarly with HashCaml. So I'm afraid they are not really proper counter examples to Vladimir's argument that type-safety cannot be ensured statically for inter-process I/O.

Senses of "Static"

It depends upon how you define your terms. If, in fact, I can't crash the runtime by using an old version of a client against a new version of a server; if, in fact, attempting to compile/link old and new versions of interfaces and their implementation fails; if, in fact, I'm guaranteed to get an exception when the distributed code mismatches, I seem to have landed at a very important sweet spot with respect to both the static case (compile/linking of incompatible services fails) and the dynamic case (the code over the network is guaranteed not to "go wrong," i.e. end up in an undefined state). That certainly addresses the "old version + new version = crash" concern, which is all one can ask for.

Linking vs IPC

I agree with everything you say, except for the last half-sentence ;-). Because dynamic linking and inter-process communication are usually different things, operating on a different level of granularity. And for the latter I indeed ask for more than per-transmission checks (which just form the distributed counterpart to a dynamically checked language).

On second thought, given that the context of your post was IPC alone and Vladimir's specific claim, I also disagree with your first sentence. :-)

What about MiniKanren in Haskell?

I'm able to reproduce MiniKanren in Haskell if I restrict the values the logic system deals with to a single ADT, but this isn't very satisfactory. It's nothing like programming in MiniKanren on top of Scheme. I don't know how to produce a logic system in Haskell that feels like a typed MiniKanren.

Now, a logical system can be typechecked. I bet a not-too-terribly complicated preprocessor could feel like "MiniKanren on Haskell." However, this is all way more complicated than what Scheme offers.

Perhaps any decidable type system prohibits useful means of expression... even though those means of expression could be typed by a different, decidable type system. Just a possibility running through my head.

Kanren was influenced

Kanren was influenced (inspired?) by the work of Spivey and Seres on embedding logic programming in Haskell, so I suggest you look at this work if this direction interests you (start here and follow the links).

The point, of course, is

The point, of course, is that this has nothing whatsoever to do with being either statically- or dynamically-typed.

It most certainly does. I would go so far as to say that any structurally subtyped language is going to be a bit more difficult to program in than a nominally typed one without help from the programming environment.

Nor would the compiler, but remember, I'm talking about that which can't be inferred in even the most powerful known type systems.

I'm not sure what you mean. The type of any statement can be inferred with the presence of compound types; it just means that some processing needs to be done at run time, which would have to be done manually by the programmer had the compiler not done it.

OK, I'll bite: how? Is the editor going to do type inference (let's keep it simple and say "in the Milner-Damas system like the ML family") and then tell you about each and every term that it can't type that way? Yikes.

Still not sure what you mean. The basic idea is to infer the types and check for the situations which might cause a run time error due to incompatible types. Syntax errors and certain other errors, e.g. guaranteed erroneous code, would be handled at compile time, as well.

Correct—that's exactly why languages such as O'Caml don't have such casts.

Ah, yes. I'd forgotten about forced checks. They might satisfy the compiler, but the presence of a check doesn't necessarily mean it does any good, as can be seen in Java exceptions. Because of this, it can't really be considered a benefit. Once you remove that, it decays into the implicit vs explicit casting argument, which I've already addressed.

Feels like it'll explode

Nor would the compiler, but remember, I'm talking about that which can't be inferred in even the most powerful known type systems.

I'm not sure what you mean. The type of any statement can be inferred with the presence of compound types; it just means that some processing needs to be done at run time, which would have to be done manually by the programmer had the compiler not done it.

Though the inferencer can always give you a "safe" answer, the precision of the type may be too poor to allow the kinds of static checks you'd like to perform. A statically typed language enforces a certain amount of modularity and abstraction in order to ensure that it will always be able to prove the absence of a certain set of errors.

You seem to be of the opinion that explicit casts are useless. They aren't; they provide a static checker with information about the programmer's intent. Without them, a checker would warn about *every* mismatched assignment statement. An explicit cast is the programmer saying: "I know these don't match up, but do it anyway". Of course, if you have to insert explicit casts all over the place (as with pre-generics Java), then there's something wrong with the type system.
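
A small sketch of the cast-as-intent point (TypeScript; `parsePort` is an invented example). Without the assertion, the checker rejects the assignment; with it, the programmer has recorded "I know these don't match up, but do it anyway", and the residual risk surfaces at runtime:

```typescript
function parsePort(raw: unknown): number {
  // Without a cast, the mismatched assignment is rejected statically:
  //   const p: number = raw; // error: 'unknown' is not assignable to 'number'
  const p = raw as number; // explicit assertion: "do it anyway"

  // The cast satisfied the checker but guarantees nothing at runtime,
  // so a defensive check is still needed if `raw` is untrusted.
  if (typeof p !== "number") {
    throw new Error("port was not a number after all");
  }
  return p;
}

const port = parsePort(8080);
```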

OK, I'll bite: how? Is the editor going to do type inference (let's keep it simple and say "in the Milner-Damas system like the ML family") and then tell you about each and every term that it can't type that way? Yikes.

Still not sure what you mean. The basic idea is to infer the types and check for the situations which might cause a run time error due to incompatible types. Syntax errors and certain other errors, e.g. guaranteed erroneous code, would be handled at compile time, as well.

I don't think you can write whatever you want and expect a sound static checker to verify the properties you're interested in without exploding. Change one line and the checker may suddenly be unable to make any useful assertions about a large chunk of your program. The restrictions imposed by a statically typed language are present primarily to avoid this situation.

Sure, static typing is often more difficult but I think the benefits are worth it. Do you have any particular examples of things that are unreasonably difficult in a modern static type system? What about languages that have a dynamic invocation "escape hatch", like VB.Net and Boo?

Though the inferencer can

Though the inferencer can always give you a "safe" answer, the precision of the type may be too poor to allow the kinds of static checks you'd like to perform.

The precision of the type depends on how many different types the variable can hold. I fail to see how this would not be useful when performing static checks.
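As a concrete sketch of that point, in OCaml (with illustrative names): when a variable may hold several types, a variant type records exactly which, and every use site must handle each case, which is what makes the imprecise type still useful to a static checker:

```ocaml
(* Sketch: a value that may be an int or a string. The variant tells
   the checker precisely which alternatives are possible. *)
type value = Int of int | Str of string

let describe = function
  | Int n -> "int " ^ string_of_int n
  | Str s -> "string " ^ s

let () = assert (describe (Int 3) = "int 3")
```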

A statically typed language enforces a certain amount of modularity and abstraction in order to ensure that it will always be able to prove the absence of a certain set of errors.

This has nothing to do with static/dynamic and everything to do with inferred/explicit typing.

You seem to be of the opinion that explicit casts are useless. They aren't; they provide a static checker with information about the programmer's intent.

The problem being, the compiler can usually guess my intent pretty well by how I use the variable.
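That is roughly what Hindley-Milner inference does in practice. A sketch in OCaml (the functions are illustrative): no annotations are written, and the types are recovered from how the values are used:

```ocaml
(* Sketch: no type annotations; the compiler infers the types
   entirely from the operations applied to each argument. *)
let greet name = "Hello, " ^ name   (* inferred: string -> string *)
let double x = x * 2                (* inferred: int -> int *)

let () =
  assert (greet "world" = "Hello, world");
  assert (double 21 = 42)
```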

Without them, a checker would warn about *every* mismatched assignment statement. An explicit cast is the programmer saying: "I know these don't match up, but do it anyway". Of course, if you have to insert explicit casts all over the place (as with pre-generics Java), then there's something wrong with the type system.

At the beginning of this paragraph you talk about how highlighting every single mismatch would simply swamp the programmer. Of course, you realize that the places where the editor would highlight mismatches are the same places where explicit casts would be required otherwise, right?

I don't think you can write whatever you want and expect a sound static checker to verify the properties you're interested in without exploding. Change one line and the checker may suddenly be unable to make any useful assertions about a large chunk of your program. The restrictions imposed by a statically typed language are present primarily to avoid this situation.

Ah, but you see, it's not that you can't give the compiler extra information with a dynamic language, it's that it's optional. Most of the time, you don't need to use it; if you do, however, it's there for you.

Sure, static typing is often more difficult but I think the benefits are worth it. Do you have any particular examples of things that are unreasonably difficult in a modern static type system?

Not necessarily difficult, but certainly cumbersome.

edit: terminology

A quick nitpick: inferred vs

A quick nitpick: inferred vs nominal isn't a relevant dichotomy. You might be looking at one or both of nominal vs structural (the answer to which is: both!) or inferred vs explicit/manifest.

Whoops. My mistake, I meant

Whoops. My mistake, I meant explicit :-)

At the beginning of this

A statically typed language enforces a certain amount of modularity and abstraction in order to ensure that it will always be able to prove the absence of a certain set of errors.

This has nothing to do with static/dynamic and everything to do with inferred/explicit typing.

A statically typed language is designed with the type checking algorithm in mind. Even systems with type inference don't allow you to write whatever you want because the language designers knew they couldn't write a type checker to handle it.

When you design a language without any direct consideration for static verification (which is, I believe, how dynamic languages are designed), it's probably not going to be easy to check. I'm guessing you'll end up with a time or state space explosion.

BTW, I think our definitions of static vs. dynamic may not match up and we may be talking past each other here. What do you consider to be the difference between a statically typed language with type inference and a dynamically typed language?

At the beginning of this paragraph you talk about how highlighting every single mismatch would simply swamp the programmer. Of course, you realize that the places where the editor would highlight mismatches are the same places where explicit casts would be required otherwise, right?

Yes. Maybe "swamp" was not the right word :) An explicit cast is a way for the programmer to tell the compiler (and other readers of the code): "yes, these types aren't implicitly compatible, but I know what I'm doing". Without them, the editor/compiler would always show you every type mismatch. Explicit casts are a way to suppress false positives.
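A sketch of how far this explicitness can go, in OCaml: there are no implicit numeric conversions at all, so every "mismatch" is acknowledged in the source with a conversion function rather than silently coerced or flagged:

```ocaml
(* Sketch: int and float never mix implicitly in OCaml; the
   conversion step is spelled out in the code. *)
let ratio = float_of_int 3 /. 4.0   (* the int -> float step is explicit *)

let () = assert (ratio = 0.75)
```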

When you design a language

When you design a language without any direct consideration for static verification (which is, I believe, how dynamic languages are designed), it's probably not going to be easy to check.

It's not going to be easy, but that's because you're moving something that's normally done by the programmer into the compiler/editor. To be honest, I'd rather have the development environment be more complex if it allows me to write more concise, flexible code.

Yes. Maybe "swamp" was not the right word :) An explicit cast is a way for the programmer to tell the compiler (and other readers of the code): "yes, these types aren't implicitly compatible, but I know what I'm doing". Without them, the editor/compiler would always show you every type mismatch. Explicit casts are a way to suppress false positives.

The way I see it, highlighting and explicit casting have the same visual impact; the only difference is that the latter is done manually by the programmer whereas the former is done automatically by the editor.

And in having been done

And in having been done manually by the programmer, a cast shows intent - "I intend to treat this String I read in as a Name", for example.

Sort of. It does show

Sort of. It does show intent, but it doesn't add anything that couldn't be discerned from the source code or the types, e.g.:

type Name = [char]

name = Name(in.readline())

adds nothing over

name = in.readline()

In our ideal editor, the latter's type will obviously be determined to be [char] and if it weren't it would be easily noticeable by the programmer.

A fan of

A fan of pointless^Wpoint-free style might not bother binding the value to a variable called name though. What then?

You look at how it's used?

You look at how it's used.

Which puts you back to

Which puts you back to attempting to assign intent when you don't actually know. Next thing you know, you're forced to conclude that either you're getting it wrong or that the programmer actually intended to dereference that null...

Wait, what?

Wait, what? I'm thinking this is a somewhat bad example. From what I can tell, a Name is just a string, so I see no reason why you actually need to label it a Name. If a Name actually has some other properties, it's time to create a new type.

Even if it's just a string...

there are reasons to wrap it. Documentation first and foremost, but also future-proofing and encapsulation of published interfaces. In high-risk environments with long deprecation cycles, that sort of thing is often worthwhile (particularly if a refactoring browser will do the encapsulation for you).

I had a "fun" case with page

I had a "fun" case with page names in my wiki clone. One of those lessons you hopefully only need learn once...

Future-proofing? Tell me,

Future-proofing? Tell me, how are you going to future-proof a string? Wrap it in get/set methods and call it "encapsulated"? Future-proofing implies the presence of properties other than just being a string, which is an entirely different situation. If you want an example of this, consider a Name which consists of any alphanumeric string:

alphanumeric = regex("[a-zA-Z0-9]+").match
type Name = [char](alphanumeric)

In this case, it makes sense to use the type in, e.g., argument lists.

It's not an entirely

It's not an entirely different situation, because in practice there aren't any strings with no additional properties - just times when you don't care what they are.

As soon as the contents of

As soon as the contents of strings have specific meanings, it can make sense to assign more specific types than simply String - it can stop you mixing up term- and type-level identifiers in a typechecker, for example.
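A sketch of that idea in OCaml (the constructor names are illustrative): single-constructor wrappers make term- and type-level identifiers incompatible, even though both are strings underneath:

```ocaml
(* Sketch: two distinct wrappers around string. Passing a TypeId
   where a TermId is expected is now a compile-time error. *)
type term_id = TermId of string
type type_id = TypeId of string

let show_term (TermId s) = "term " ^ s
(* show_term (TypeId "int")  <- would be rejected by the compiler *)

let () = assert (show_term (TermId "x") = "term x")
```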

Not quite

The latter doesn't show intent, it shows either intent or programmer error. Could be either one, and there's no way to determine which without examining a lot of context. The (admitted) redundancy and verbosity of the first carries an implicit "Yeah, I really meant this" which the second just doesn't have. Fine if you and your team don't make many errors, but I'm just not that good.

Intent includes programmer

Intent includes programmer error.

I'm confused...

Did you just define intent to include carelessness, ignorance, and confusion (the root causes of most programmer error)? If so, you've probably stretched the word far beyond what most of the participants here are used to.

Did you just define intent

Did you just define intent to include carelessness, ignorance, and confusion (the root causes of most programmer error)?

That's what I said, yes. Consider a casting example, which supposedly only shows "intent":

char* zero = (char*)0;
puts(zero);

This obviously shows that the programmer is fully aware that he's casting an int to a char*--intent--but it also shows the misconception that said operation will return the string "0". Obviously this is a blatant error--an error that, knowingly made or not, still shows intent.

And it shows intent

And it shows intent regarding char * - the problem is that char * may or may not refer to an array, and thus a string: the type system has left the declaration of intent ambiguous.

Programmer error?

Well, breaking it down into a simplistic framework, we have:

  • What the customer thinks he needs
  • What the programmer thinks the customer needs
  • What the programmer thinks the program does
Any one of these three things can be in error. The customer doesn't really comprehend what they need. The programmer has very little inkling of what the customer wants. And the programmer doesn't fully understand how the software works. But they are three different classes of errors. I find it silly to say that because the first two are very hard to get right, we shouldn't worry about the third class of errors. If the programmer can't get the program right, it's pretty irrelevant whether the customer and programmers are in complete agreement about the task.

So, yes, the programmer may have an error in understanding the requirements. But once the programmer decides upon a course of action, that programmer dang well ought to know how the software meets their conception of the requirement. Adding an extra layer of uncertainty doesn't make things more flexible.

As a programmer, I generally classify bugs as those things that a program does that I can not understand or explain - i.e. the program acts unexpectedly. Of course, the customer may consider things they do not expect as bugs, but that is a different story.

[Edit Note: This response is probably only loosely related to the current thread. I've just been meaning to express my opinion about the belief that, because customer specs are so fluid, program verification is not helpful.]

Point of Divergence

Curtis W: It's not going to be easy, but that's because you're moving something that's normally done by the programmer into the compiler/editor. To be honest, I'd rather have the development environment be more complex if it allows me to write more concise, flexible code.

This is where I think we're talking past each other: dynamic languages aren't "moving something that's normally done by the programmer into the compiler/editor." Quite the opposite! Typically (and I really should write "without exception that I'm aware of," because it's true of at least Scheme, Common Lisp, Perl, Python, Ruby, and TCL), dynamically-typed languages make it extremely difficult to do type inference even in a relatively simple type system such as Milner-Damas, let alone a richer type system such as, e.g. GHC 6.4's with GADTs and so forth. The type checking generally takes place in the programmer's head, supplemented by whatever checking you get at runtime, and I don't know of any runtime checking in any of those languages that's getting anywhere near what, e.g. GADTs do.

So the assertion that you can have a dynamic language with an external editor that somehow "shows you where you'd need casts" is a nice idea in theory, but the only practical example of it I'm aware of is DrScheme's MrFlow, which seems to be perpetually "coming soon," although we once had MrSpidey. It would be nice to compare MrFlow's inferred types with the types inferrable by GHC 6.4 or later, to try to get to a reasonable apples-to-apples comparison.

But clearly, we agree on the desirability of sophisticated environments that "allow us to write more concise, flexible code." It just might be surprising how close you can get to that in a statically-typed language. Consider the venerable subject-observer pattern. Here it is, in its entirety, in O'Caml:

class ['O] subject =
object (self : 'selftype)
  val mutable observers : 'O list = []
  method add_observer obs = observers <- obs::observers 
  method notify (message : 'O -> 'selftype -> unit) =
    List.iter (fun obs -> message obs self) observers
end

class ['S] observer =
object
end

Even if you don't know O'Caml, you can probably intuit what's going on: there's a "subject" class that's parameterized by the observer type. It maintains a list of observers. It has a method to add an observer to the list, and a method to notify all observers of something. That "something" is a function taking an observer and a subject as parameters. Then there's an observer class, which is parameterized by the subject type. By default, it does nothing.

Here's an example of its use:

class ['O] window =
object (self)
inherit ['O] subject
  val mutable position = 0
  method move d = position <- position + d; self#notify (fun x -> x#moved)
  method draw = Printf.printf "Position = %d\n" position
end

class ['S] manager =
object
inherit ['S] observer
  method moved (s : 'S) = s#draw
end

So now we have a window class and a manager class. Whenever a window is moved, it notifies its observers by calling their "moved" method. The manager's "moved" method calls the "draw" method on the window. Let's test it:

# let w = new window in w#add_observer(new manager); w#move 5;;
Position = 5
- : unit = ()

Cool, but it's not apparent at this point why this is cooler than the pattern in Scheme or Smalltalk or whatever. So let's mess around a bit.

class ['O] no_draw =
object (self)
inherit ['O] subject
  val mutable position = 0
  method move d = position <- position + d; self#notify (fun x -> x#moved)
end

Now let's try to use it:

# let w = new no_draw in w#add_observer(new manager); w#move 5;;
This expression has type
  (< draw : 'b; .. > as 'a) manager = < moved : 'a -> 'b >
but is here used with type < moved : 'c no_draw -> unit; .. > as 'c
Type 'a is not compatible with type
  'c no_draw =
    < add_observer : 'c -> unit; move : int -> unit;
      notify : ('c -> 'c no_draw -> unit) -> unit >
Only the first object type has a method draw

In my terminal window, the expression "new manager" is underlined, so it's clear that it's what's being referred to by "This expression." As you can see, the compiler gave us a very detailed error, and was kind enough to sum it up in the last line: "Only the first object type has a method draw." Similarly:

# let w = new window in w#add_observer(new observer); w#move 5;;
This expression has type 'a observer but is here used with type
  < moved : 'b window -> unit; .. > as 'b
Only the second object type has a method moved

We can't just stick a plain ol' observer on a window, because the plain observer doesn't have a "moved" method.

So with 10 lines of extremely simple code we have the subject-observer pattern, with the benefit traditionally ascribed to a dynamic implementation (you can put any "type" of observer on the subject that you want as long as the observer knows how to respond to the notification; the subject and observer can evolve completely independently of each other) but with the added benefit that if there is a mismatch between the methods supported by the subject and observer, the code will not compile, vs. experiencing a "method not found" exception at runtime.

This is where I think we're

This is where I think we're talking past each other: dynamic languages aren't "moving something that's normally done by the programmer into the compiler/editor." Quite the opposite! Typically (and I really should write "without exception that I'm aware of," because it's true of at least Scheme, Common Lisp, Perl, Python, Ruby, and TCL), dynamically-typed languages make it extremely difficult to do type inference even in a relatively simple type system such as Milner-Damas, let alone a richer type system such as, e.g. GHC 6.4's with GADTs and so forth.

If you give me an example I will gladly show you how it would be done in a hybrid system.

class ['O] subject =
object (self : 'selftype)
  val mutable observers : 'O list = []
  method add_observer obs = observers <- obs::observers
  method notify (message : 'O -> 'selftype -> unit) =
    List.iter (fun obs -> message obs self) observers
end

class ['S] observer =
object
end

type Observer(method) = any(x => hasattr(x, method))
def notify(objects:[Observer(method)], method, *args):
    for object in objects:
        getattr(object, method)(*args)

Mine's shorter ;-) Oh, and because I know you're thinking it: it's completely possible to check this at compile time. The compiler can and will perform the type checks (hasattr(x, method)) at compile time in the same situations as yours. The difference is, of course, that if mine can't determine the type at compile time, it assumes the programmer knows what he's doing and just inserts a check in case he doesn't. Yours, on the other hand, assumes the programmer is wrong and requires that he insert the proper checks; whether they're actually helpful depends on how responsible the programmer is, just like in my system.

Now, I don't know about you, but I'd think having a programming language that assumes I know what I'm doing is much less annoying than one that assumes I'm always wrong. It's also worth noting: I'm advocating designing for responsible, knowledgeable people, not people who are perfect. Mistakes are inevitable and I fully recognize that this needs to be addressed when designing a language, but I'm definitely against assuming everything is a mistake when it's not 100% certain that it is.

It's Hard to Compete...

...with code written in a language that doesn't exist, so...

Curtis W: Oh, and because I know you're thinking it: it's completely possible to check this at compile time. The compiler can and will perform the type checks (hasattr(x, method)) at compile time in the same situations as yours. The difference is, of course, that if mine can't determine the type at compile time, it assumes the programmer knows what he's doing and just inserts a check in case he doesn't.

...shall we say, assumes facts not in evidence. :-)

Curtis W: Yours, on the other hand, assumes the programmer is wrong and requires that he insert the proper checks, whether they're actually helpful or not depends on how responsible the programmer is, just like in my system.

Huh? The O'Caml "subject" and "observer" classes literally assume nothing (type them into the O'Caml toploop and you'll find that the inferred types contain no constraints), and the examples only contain code that's actually necessary to address the domain (type them into the O'Caml toploop and you'll find that the inferred types only constrain the objects to have the methods used in the code, and no more). There are no casts, no coercions, and certainly no checks, proper or otherwise, inserted. All that happens is that the compiler prevents me—absolutely—from mixing subjects and observers that can't actually communicate with each other.

Curtis W: Now, I don't know about you, but I'd think having a programming language that assumes I know what I'm doing is much less annoying than one that assumes I'm always wrong.

I personally prefer "trust, but verify." The O'Caml subject/observer implementation (which, BTW, isn't mine; it's straight out of "On the (un)reality of Virtual Types") is very nice: as far as the "subject" class is concerned, the "message" can be any function that takes an observer and a subject as parameters—and it keeps working no matter whether you subclass "subject" or "observer" or not, in parallel or not, as long as, when you compile the add_observer call, each class has the method(s) that are called by the other class. End of story. This is vastly preferable to potentially delivering code to the customer that throws a "method not understood" exception!

Curtis W: Mistakes are inevitable and I fully recognize the fact that this needs to be addressed when designing a language, but I'm definitely against assuming everything is a mistake when it's not 100% certain that it is.

Me too, which is why I prefer languages that, to the greatest extent possible, check for actual mistakes, vs. either of the undesirable cases of:

  1. Assuming that what you want to do is a mistake (in fairness, most popular statically-typed languages suffer from this to an appreciable degree)
  2. Letting you actually make a mistake and making your users pay for it

Huh? The O'Caml "subject"

Huh? The O'Caml "subject" and "observer" classes literally assume nothing (type them into the O'Caml toploop and you'll find that the inferred types contain no constraints), and the examples only contain code that's actually necessary to address the domain (type them into the O'Caml toploop and you'll find that the inferred types only constrain the objects to have the methods used in the code, and no more). There are no casts, no coercions, and certainly no checks, proper or otherwise, inserted. All that happens is that the compiler prevents me—absolutely—from mixing subjects and observers that can't actually communicate with each other.

I'm talking about the situations in which it's not possible to determine at compile time whether a given variable fits the requirements, e.g:

def create_compatible():
    event = ( => out.print("compatible"))
    return None::[event=event]

def create_incompatible():
    return None

type_map = {0:create_compatible, 1:create_incompatible}
observer = type_map[in.readline().to_int()]()
notify(observer, "event")

In that example, there are several potential points of failure that aren't handled--the most obvious being that the observer may or may not have the correct event. The other, less obvious points of failure are that the user might not input an integer or it might not be in the correct range. If you were to write this in a statically typed, non-casting programming language, you'd end up with something like this:

def create_compatible():
    event = ( => out.print("compatible"))
    return None::[event=event]

def create_incompatible():
    return None

type_map = {0:create_compatible, 1:create_incompatible}
input = in.readline()
if input.is_int():
    number = input.to_int()
    if number in [0..2]:
        observer = type_map[number]()
        if hasattr(observer, "event"):
            notify(observer, "event")
        else:
            raise "Observer doesn't have an 'event' method"
    else:
        raise "Number not in range [0..2]"
else:
    raise "Input is not integer"

Does this look like any other languages that you know that force you to do things? I don't know about you, but this is eerily close to Java and exceptions. The only way to avoid this fate in a statically typed language is to use casts, which are no better than just using dynamic typing.

Outside World

Let's stipulate, please, that the point at which a language has to communicate with the world outside the runtime is the point at which you have to have scaffolding to ensure that the interaction will be successful or somehow signal failure. Again, FFIs, marshaling/unmarshaling, and some categories of I/O all obviously exist at this intersection. These are interesting (as my posts elsewhere about Acute, Saffire, HashCaml, etc. attest), but they have nothing to do with the 90%+ of any given program of any size that aren't about FFIs, marshaling, or I/O.

Also, please do me a favor: if you're going to write code to demonstrate counterexamples, please use a real language: it's spectacularly unlikely that your intuition as to what can be done in a language is sufficiently accurate for claims about your hypothetical examples to be correct unless you've implemented half a dozen languages or so—at the very least, you should have worked through "Lisp In Small Pieces" and "Types and Programming Languages" if you haven't yet implemented any languages yourself. You might also wish to learn O'Caml, at least, since your earlier comments about the subject/observer pattern that I posted were incorrect.

Let's stipulate, please,

Let's stipulate, please, that the point at which a language has to communicate with the world outside the runtime is the point at which you have to have scaffolding to ensure that the interaction will be successful or somehow signal failure. Again, FFIs, marshaling/unmarshaling, and some categories of I/O all obviously exist at this intersection. These are interesting (as my posts elsewhere about Acute, Saffire, HashCaml, etc. attest), but they have nothing to do with the 90%+ of any given program of any size that aren't about FFIs, marshaling, or I/O.

That's true, but it doesn't negate my approach in the slightest.

Also, please do me a favor: if you're going to write code to demonstrate counterexamples, please use a real language: it's spectacularly unlikely that your intuition as to what can be done in a language is sufficiently accurate for claims about your hypothetical examples to be correct

Actually, you're right. I suppose O'Caml doesn't force you to provide an alternative, but it still requires you to use some sort of casting or similar method.

No, On Two Counts

Curtis W: That's true, but it doesn't negate my approach in the slightest.

True: both dynamically-typed and "hybrid," which I take to mean "dependently-typed," languages have interesting stories to tell around the issues of safe I/O.

Curtis W: Actually, you're right. I suppose O'Caml doesn't force you to provide an alternative, but it still requires you to use some sort of casting or similar method.

I'm unsure of the context here. If you mean doing I/O in, e.g. O'Caml and ensuring that the values brought into the runtime are of the types ascribed to them at the time of input, then you're almost right: you have to construct values of a type defined to the compiler from some other "dynamic" type (a string to parse, a byte-array to munge, etc.) For example, here's a function to read a 32-bit integer in little-endian format from an input stream in O'Caml:

(** This function reads an int32 that is encoded in little-endian order from the passed in_channel.
    @author Paul Snively
    @param in_chan in_channel
    @return int32 *)
let input_int32_le in_chan =
  let aux le =
    if String.length le <> 4 then 
      raise (Invalid_argument "int32_from_little_endian");
    let d3 = of_int (Char.code le.[0])
    and d2 = of_int (Char.code le.[1]) 
    and d1 = of_int (Char.code le.[2]) 
    and d0 = of_int (Char.code le.[3]) in
      (logor (shift_left d0 24) 
	 (logor (shift_left d1 16) 
            (logor (shift_left d2 8) d3))) in
  let buf = String.create 4 in
    really_input in_chan buf 0 4;
    aux buf

There's one visible failure case (aux explicitly checks the string length and raises an exception if it's wrong) and at least a couple of invisible ones (String.create can fail, really_input can fail in a couple of ways). Of course, what I imagine you're really referring to is the fact that, given an arbitrary input stream at an arbitrary position, there's no way to ensure that the next four bytes are, in fact, intended to be a 32-bit integer in little-endian order. That's the part that we appear to agree to stipulate—in this case, you can see the liberal application of type constructors taking some other type. That's not the same as casting—these operations are guaranteed to be safe—but there's a clear demarcation between the world of bytes and the world of int32s here.
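The same demarcation can be seen in miniature with modern OCaml's option-returning parsers (int_of_string_opt is standard library; the helper function is illustrative): a raw string is converted to a typed value through a constructor whose failure shows up in the result type, rather than through a cast:

```ocaml
(* Sketch: crossing from the world of strings to the world of ints
   through a conversion whose failure is visible in the type. *)
let int_of_line line =
  match int_of_string_opt (String.trim line) with
  | Some n -> Ok n
  | None -> Error ("not an integer: " ^ String.trim line)

let () =
  assert (int_of_line " 42\n" = Ok 42);
  assert (int_of_line "abc" = Error "not an integer: abc")
```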

If, however, that isn't what you're referring to, then I'm at something of a loss: you may be continuing to claim that, e.g. the subject/observer pattern in O'Caml requires casts or conversions, which has already been demonstrated to be false. That strikes me as unlikely to be the case, though, as you don't seem willfully obstinate. :-)

I'm unsure of the context

I'm unsure of the context here. If you mean doing I/O in, e.g. O'Caml and ensuring that the values brought into the runtime are of the types ascribed to them at the time of input

Yeah, pretty much.

Also, please do me a favor:

Also, please do me a favor: if you're going to write code to demonstrate counterexamples, please use a real language: it's spectacularly unlikely that your intuition as to what can be done in a language is sufficiently accurate for claims about your hypothetical examples to be correct unless you've implemented half a dozen languages or so—at the very least, you should have worked through "Lisp In Small Pieces" and "Types and Programming Languages" if you haven't yet implemented any languages yourself. You might also wish to learn O'Caml, at least, since your earlier comments about the subject/observer pattern that I posted were incorrect.

I'm confused. Are you saying that the code examples in my language are likely impossible, or that I don't know what's possible in other languages?

Broadly...

...both. I'm concerned because you've claimed that there's nothing about Python that would make type inference challenging, and you've totally misunderstood even the very simple O'Caml code I've posted. So when you post code in a language that doesn't even exist, all you're proving is that you can conceive of a syntax that you like. That doesn't mean that you can provide the syntax the semantics that you're claiming for it, any more than "Can God make a stone so big that even He can't move it?" tells us anything about large stones or God.

and you've totally

and you've totally misunderstood even the very simple O'Caml code I've posted.

What have I misunderstood?

So when you post code in a language that doesn't even exist, all you're proving is that you can conceive of a syntax that you like. That doesn't mean that you can provide the syntax the semantics that you're claiming for it, any more than "Can God make a stone so big that even He can't move it?" tells us anything about large stones or God.

This isn't some language I've conceived right here and now just for this discussion--I've been working on it and thinking about it for the past year. I'm fairly certain I know what's possible and what's not because I've already thought it through. If you don't believe it's possible, I'll gladly consider your argument, but I don't tend to be wrong on these types of things.

You might like to post

You might like to post semantics for the language somewhere then. When people post code in existing languages, we already know exactly what it means.

The semantics are in my

The semantics are in my description of any code posted and in the comparison between my code and his.

She means formal semantics.

She means formal semantics.

For certain values of formal

For certain values of formal - definition by interpreter works for me.

Driving Towards Specifics

Curtis W: What have I misunderstood?

A fair question. Your initial reaction to the subject/observer pattern was that it somehow "assumed the programmer was wrong" and "required checks to be inserted by the programmer." Neither of these observations was accurate. You then changed the discussion to type-safety around I/O, which we eventually stipulated is an issue, but it's still totally distinct from whether non-I/O code imposes any restrictions/constraints on the programmer.

Curtis W: This isn't some language I've conceived right here and now just for this discussion--I've been working on it and thinking about it for the past year. I'm fairly certain I know what's possible and what's not because I've already thought it through.

My point is precisely that absent extensive experience and/or formal training in designing programming languages, it's unbelievably unlikely that your language will actually prove implementable. Particularly when you talk about "hybrid static/dynamic systems," this is an area of open research into partial evaluation, dependent types, etc. People get PhDs in this domain, today. "Thinking it through" over the past year or so, particularly if you've never actually delivered a language before (I'm assuming, since you haven't commented on it), means quite literally nothing.

Curtis W: If you don't believe it's possible, I'll gladly consider your argument, but I don't tend to be wrong on these types of things.

Everyone tends to be wrong on these things. It's the study of type theory and programming language design, coupled with experience, that tends to reduce the error rate. Like Philippa, I'll withhold further judgment until I see some kind of formal semantics.

A fair question. Your

A fair question. Your initial reaction to the subject/observer pattern was that it somehow "assumed the programmer was wrong" and "required checks to be inserted by the programmer." Neither of these observations was accurate.

That was in reference to static typing in general, not just your code. The latter is my fault; I was basically trying to say that static typing requires some sort of casting mechanism to be invoked by the programmer. If you'll kindly point to something you've said that I haven't refuted, I'll gladly do so now.

which we eventually stipulated is an issue, but it's still totally distinct from whether non-I/O code imposes any restrictions/constraints on the programmer.

Interaction with the outside world is exactly where static and dynamic typing differ and it's what I've been either talking about or referring to the entire time, although I've just realized that now.

Everyone tends to be wrong on these things. It's the study of type theory and programming language design, coupled with experience, that tends to reduce the error rate. Like Phillipa, I'll withhold further judgment until I see some kind of formal semantics.

What is it you need to know? I figured most of the code samples are pretty self-explanatory, especially since the type system isn't all that different from O'Caml's except for the dynamic typing and dependent types.

the type system isn't all

the type system isn't all that different from O'Caml's except for the dynamic typing and dependent types.

Dynamic dependent types must be about as different from OCaml's type system as you can get!


What they want is a full description of how your language works, preferably as formal semantics made up of rules but at least an accurate, unambiguous description of the meaning of your language's code.

Dynamic dependent types

Dynamic dependent types must be about as different from OCaml's type system as you can get!

Really, I was referring to the ability to have union types and type constructors, which aren't present in most languages. I'd say my type system is about as far away from Python as it is from O'Caml.

What they want is a full description of how your language works, preferably as formal semantics made up of rules but at least an accurate, unambiguous description of the meaning of your language's code.

I know, I just don't think it's necessary for such a narrow topic as a type system.

It is, though. All too often

It is, though. All too often "just extend this type system with that feature" turns into a complete quagmire - you'd be amazed how much difficulty's been had with subtyping, for example.

I guess this is the closest

I guess this is the closest thing to an up-to-date spec. I do have a slightly old EBNF grammar for it, but I haven't touched it in at least a month or two.

Syntax != semantics

Syntax is not the same thing as semantics. A grammar tells us what a program should look like (syntax), but nothing about what the effect of executing that program will be (semantics). As Philippa has already mentioned, even an interpreter would give us some idea of what the semantics of the language are. Have you written an interpreter for this language?

The code example actual

The code example actually contains rather detailed explanations of what's going on. I could possibly get the grammar up to snuff. Maybe I should relabel that page as 'General Overview.'

OT: Large stones...

Just an aside, the stones question is just a way of asking whether something can be true and false at the same time...

To bring the question into a PL context, a better way to phrase the age old question:

Can Oleg find a single PL that combines all his insights (currently scattered amongst Scheme, ML and Haskell)?

By "easy", I meant "doable" :)

It's not going to be easy, but that's because you're moving something that's normally done by the programmer into the compiler/editor. To be honest, I'd rather have the development environment be more complex if it allows me to write more concise, flexible code.

Everyone here is probably in favor of a language that lets you write more concise and flexible code. But by "not easy", I meant "really, really hard". Hard to the point of being impractical.

The way I see it, highlighting and explicit casting have the same visual impact, the only difference being that the latter is done manually by the programmer whereas the former is done automatically by the editor.

I agree that they have the same visual impact, but as many others have said, an explict cast lets the programmer give the static checker some additional information.

Let's say you had an editor that highlighted type mismatches. In a language with explicit casts, the editor will only highlight a subset of the mismatches -- the subset that hasn't been explicitly OK'd by the programmer. In a language with no mechanism for the programmer to indicate "this mismatch is OK", the editor will highlight every single type mismatch; it has no way of knowing which ones you've already checked and approved of. I think you'd end up completely ignoring the editor warnings.
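As an aside, checked languages commonly make that "this mismatch is OK" record explicit. A minimal Python sketch of the idea, using `typing.cast` (the `Shape`/`Rectangle` classes here are invented for illustration): a static checker trusts the cast, while at runtime it's a no-op.

```python
from typing import cast

class Shape: ...

class Rectangle(Shape):
    def area(self) -> int:
        return 42

def pick_shape() -> Shape:
    # Declared to return Shape, though it actually builds a Rectangle.
    return Rectangle()

s = pick_shape()
# A static checker would flag s.area(): Shape has no 'area' method.
# cast() records the programmer's "this mismatch is OK" decision;
# at runtime it simply returns its argument unchanged.
r = cast(Rectangle, s)
print(r.area())  # 42
```

Because the approved mismatches are marked in the source, a checker (or an editor) only needs to flag the unapproved remainder.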

Throughout this discussion, I've been thinking of explicit casts in the context of unsafe downcasting in object oriented languages. For the example you brought up, it might be reasonable for there to be implicit conversions between the two types. For example, the 'type' keyword in Haskell actually allows the following:

type Name = [Char]
x :: Name
x = "abc"

Also, C# has a mechanism for the programmer to define implicit conversions.

Everyone here is probably

Everyone here is probably in favor of a language that lets you write more concise and flexible code. But by "not easy", I meant "really, really hard". Hard to the point of being impractical.

What makes you say that?

Let's say you had an editor that highlighted type mismatches. In a language with explicit casts, the editor will only highlight a subset of the mismatches -- the subset that hasn't been explicitly OK'd by the programmer. In a language with no mechanism for the programmer to indicate "this mismatch is OK", the editor will highlight every single type mismatch; it has no way of knowing which ones you've already checked and approved of. I think you'd end up completely ignoring the editor warnings.

Highlighting would only occur where casting would be required in other languages. For that statement to hold, those other languages would need so many casts that you'd end up ignoring them there, too. Maybe I haven't been clear enough on how it works--with a dynamically typed language, checking for type mismatches means checking for instances in which a value might not be valid for a given variable. This means that the example you posted above would not be highlighted, either.

Details

Maybe I haven't been clear enough on how it works

Yes, this is exactly why putting a more detailed writeup on your own site is a good idea.

Aye. Unfortunately, I'm more

Aye. Unfortunately, I'm more of a coder than a writer, so I guess that's to be expected.

Interpreter

That's OK. If you can code up a small interpreter, then we can run your examples and see exactly what you're talking about.

Actually, I already have an

Actually, I already have an almost-complete interpreter, but I don't think it would be very helpful in explaining how type inference is used, considering my interpreter doesn't use type inference.

Going to Church

It would be good enough to show an interpreter of the explicitly typed language after type inference (i.e. in Church style). Of course, it would be more convincing if your examples had enough type annotations to pass your type checker so we won't have to guess how the inference works.

Of course, it would be more

Of course, it would be more convincing if your examples had enough type annotations to pass your type checker so we won't have to guess how the inference works.

Most of what I posted looks pretty obvious--do you have any specific examples in mind?

Beyond the primitives...

...what I'd prefer to see is something that doesn't revolve around pure primitives (ints, floats, strings, etc.) and something along the lines of data abstractions (classes, modules, etc.). My thinking is that you quickly move away from the primitive types and into the user-defined types rapidly.

I may not be the most reliable source for suggestions having spent way too much effort on this particular toy example, but perhaps a Shape example where you have a collection (list, array, etc...) of shapes (rectangle, circle, etc) that gets a type error when you attempt to add an object that is not derived from the shape class. So from a Python perspective, this should throw an error:

   scribble : Shape list = [Rectangle(10, 20, 5, 6), Circle(15, 25, 8), "NotAShape"]

From there, you could move out to more complex forms of type derivation and inference.

Well, your shape example

Well, your shape example would look something like this:

type Point = [number]::[move]
def point_move(point:Point, distance:[number](x => len(x)==len(point))):
    point[:] = zip(point, distance).map(([x,y] => x+y))

def create_point(point:Point):
    return point::[move=point_move]

data Rectangle:
    pos:Point
    width:number
    height:number

def rect_draw(rect):
    out.print(rect)

def create_rectangle(pos, width, height):
    return Rectangle(create_point(pos), width, height)::[
        draw = rect_draw
    ]

...

def draw_shapes(shapes):
    for shape in shapes:
        shape.draw()

draw_shapes([
    create_rectangle([0, 0], 10, 10)
])

draw_shapes([None]) # Error

I'm not really sure what this has to do with the discussion, though.

edit: Forgot to show use. Basically what happens is that the compiler can infer that shapes in draw_shapes has to be an iterable object which iterates over objects with a draw attribute. Since None doesn't have said attribute, it gets flagged at compile time.

Iterable object?

The discussion was about type inference and type checking. You say that we can infer that the type of the shapes collection has to implement an iterable interface, but that begs the question of what the type is. What if I have an unrelated class that also happens to implement the iterable interface? Well, I suppose we can then go out and find out that this item not only has to implement the iterable interface, we can also somehow infer that it needs a draw() method (or else it would get a doesNotUnderstand - or whatever that Smalltalk terminology is). So we've determined these two things, but we can still get an iterable class that has a draw method that is not derived from the shape class.

Of course, the Dynamic crowd will tell us that if it implements these methods then it effectively meets the type criteria (duck typing). But what would you call the type of shapes? And how do you go about restricting what is an allowable element within that collection?

Anyhow, this was not intended to be an exercise where you display untested code. This was just a simple starting example that might give a better reference point in determining how you infer the types of a collection (with the added bonus of having to contend with subtyping in the inference process), and how you would annotate and enforce the type for that collection when the user wants to limit it further than just a drawable and iterable item (maybe the programmer only wants rectangles).

The discussion was about

The discussion was about type inference and type checking. You say that we can infer that the type of the shapes collection has to implement an iterable interface, but that begs the question of what is the type.

The inferred type is [any::[draw]], which says, in English, "an object which can iterate over any objects with a draw method."

What if I have an unrelated class that also happens to implement the iterable interface.

That's fine, so long as it iterates over objects of type draw. This means, for example, that the objects might not even be on the same machine that the program is running on--it could be retrieving them from a server.

So we've determined these two things, but we can still get an iterable class that has a draw method that is not derived from the shape class.

Ah, but you see, having a 'draw' method is being derived from the Shape class. If you actually wanted to define a Shape it would look something like this:

type Shape = any::[draw]

Which means exactly what it looks like--any object with a 'draw' method. There is no inheritance in my language; if you want an interface, use something like the above code, if you want reuse, just tack the functions you want to reuse onto the new class.
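For comparison, Python can express much the same structural idea, "any object with a draw method", without inheritance, via `typing.Protocol`. A small sketch (the class names are invented for illustration):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Shape(Protocol):
    """Structural type: anything with a draw() method qualifies."""
    def draw(self) -> None: ...

class Circle:               # note: does not inherit from Shape
    def draw(self) -> None:
        print("circle")

class Label:                # has no draw method
    pass

# Membership is decided by shape, not by ancestry.
print(isinstance(Circle(), Shape))   # True
print(isinstance(Label(), Shape))    # False
```

A static checker applies the same structural test at compile time: any object with a matching `draw` method is accepted where a `Shape` is expected.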

Of course, the Dynamic crowd will tell us that if it implements these methods then it effectively meets the type criteria (duck typing).

That's because it does meet the criteria.

And how do you go about restricting what is an allowable element within that collection?

If you want to artificially restrict it, just assign the variable a type.

(with the added bonus of having to contend with subtyping in the inference process)

Structural subtyping is the basis with which inference operates in my language. I hardly see how this is "contending."

Have you dealt with variance

Have you dealt with variance issues yet? Can you correctly type mutable and immutable lists parameterised on the item type?

I've no idea what variance

I've no idea what variance issues are :) I'm also not completely sure what you mean by "correctly type...lists parameterised on the item type." If you mean can I represent a list type in the type system, then yes:

type sequence(element_type) = any::[
    get:function(any)->element_type,
    set:function(any, element_type)->None
]

I'm still working on the syntax for function types since I haven't really thought about them much.

Can you correctly type "list

Can you correctly type "list of foo"? If you don't know what variance means the answer is probably no, whether you know it or not. We covered that one recently when someone else was talking about subtyping, anyone got a link handy? (I'd dig it up myself but it's late here)

Can you correctly type

Can you correctly type "list of foo"?

Yup. Using my previous post's definition of sequence, the type "list of foo" would be sequence(foo).

Variance

Variance is one of the subtyping issues Philippa mentioned earlier. What are the subtyping rules for arrays? Is array(int) regarded as a subtype of array(object)? When is one function considered a subtype of another?

An array (iterable), if you

An array (iterable), if you were to really go down deep, is essentially any object with an iter method that returns a generator object holding the given type. If you were to express that in code, it would look like this:

type generator(x) = any::[
    next:function()->x
]
type iterable(x) = any::[
    iter:generator(x)
]

(That's a slight simplification on the generator part, but it's essentially the same)

Through this definition, it becomes obvious that iterable(int) is, indeed, considered a "subclass" of iterable(any) due to the nature of the any type.

If the array in question's

If the array in question's mutable, that's a mistake - your typing would allow any instance of any to be written into it by anything that doesn't have the more specific type for it.

What do you mean?

What do you mean?

I've elaborated on in the

I've elaborated on it in the post immediately below.

An array of int can be used

An array of int can be used as an array of any, then it can have strings written into it for example.

Have you decided on your function subtyping rules? A similar problem will crop up then.
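The problem with letting a mutable list(int) be a subtype of list(any) can be shown concretely. A small Python sketch (the "upcast" is simulated by plain aliasing, since Python imposes no static types):

```python
def total(ints):
    # A consumer that relies on every element being an int.
    return sum(ints)

ints = [1, 2, 3]

# If list(int) were a subtype of list(any), this coercion would be
# allowed, and the alias could legally receive any value at all:
anys = ints
anys.append("surprise")    # fine for a list(any)...

# ...but the same object is still the int-only list, so the int-only
# consumer now blows up at runtime instead of at type-checking time.
try:
    total(ints)
    unsound_write_detected = False
except TypeError:
    unsound_write_detected = True
print(unsound_write_detected)   # True
```

This is why mutable containers must be invariant in their element type: reading alone would permit covariance, writing alone contravariance, and a container that supports both gets neither.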

An array of int can be used

An array of int can be used as an array of any, then it can have strings written into it for example.

No, it can't. I gave you the definition for an iterable, which isn't modifiable. A list type would look something like this:

type list(x) = any::[
    get_item:function()->x,
    set_item:function(x)->None,
    ...
]

Well, I was explaining the

Well, I was explaining the current issue of variance for mutable arrays. I don't know how your language works. So list(int) is not a subtype of list(any)?

Nope.

Nope.

Concrete information would help

Perhaps it would help if you would

  1. Post a complete definition of your type system somewhere
  2. Describe your algorithm for type inference or type checking or whatever it is that your compiler is supposed to be doing

Haskell type synonyms

For the example you brought up, it might be reasonable for there to be implicit conversions between the two types. For example, the 'type' keyword in Haskell actually allows the following:

type Name = [Char]
x :: Name
x = "abc"

The reason that works is because the type keyword doesn't define a new type. It creates a type synonym, which is just a new name for an existing type (usually used for mnemonic purposes). There is no "implicit conversion" going on, because Name and [Char] are the same type.
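Python draws a similar line between a mere synonym and a genuinely distinct name. A small sketch using `typing.NewType` (the names `Name` and `UserId` are invented for illustration):

```python
from typing import NewType

# A plain alias, like Haskell's 'type' keyword: Name *is* str, so no
# conversion of any kind is involved.
Name = str
x: Name = "abc"

# NewType is closer to Haskell's 'newtype': a static checker treats
# UserId as a distinct type, but at runtime the constructor simply
# returns its argument unchanged.
UserId = NewType("UserId", str)
uid = UserId("abc")

print(type(x) is str)     # True
print(uid == "abc")       # True
print(type(uid) is str)   # True: no runtime wrapper exists
```

In both languages the synonym costs nothing at runtime; only the distinct-name form gives the checker something extra to enforce.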

If the editor can infer the

If the editor can infer the type and let me know of it, why shouldn't/wouldn't the compiler do the same thing? This is one thing I've really never understood about the dynamic typing argument: it frequently seems to come down to "let's do type inference somewhere other than the compiler."

The editor doesn't infer the type (at least not in Smalltalk). It checks to see if the message you're sending has a handler somewhere in the system and suggests similarly named messages in case you mistyped.

Second - types are a red herring. I don't care what type something is, I care about what messages it responds to.

Third - compile time type inference - you can't do it in a live system with objects that can change shape/class etc (again - this is Smalltalk). Suppose I haven't defined that type at the time I write the code or I add a type to the system later or remove one - you'd have to recompile every method that may reference that type. This just isn't too feasible.

You seem to have a static view of software - I prefer to work with interactive systems that are live. Live systems morph - at times they enter inconsistent states - but these states are typically easy to recover from and to detect. Furthermore, I typically test as I write (often coding in the debugger), and my systems are more likely correct than systems that die and are resurrected over and over again.

Recompilation

Third - compile time type inference - you can't do it in a live system with objects that can change shape/class etc (again - this is Smalltalk). Suppose I haven't defined that type at the time I write the code or I add a type to the system later or remove one - you'd have to recompile every method that may reference that type. This just isn't too feasible.

Actually, I suspect that this might be feasible and that, ironically, Self is the language to look to for this. Of course, compile-time would occur at run-time in this scenario.

type inference in smalltalk

Perhaps saying it happens at "compile time" is misleading for Smalltalk, since very minimal compiling is done (at least for Squeak, with which I've done a good bit of work). That is, everything is dynamically dispatched anyway, so there is no specialization done compiling to bytecode.

It is important to note, though, that compiling does happen in ST - the implementation is just designed to accommodate runtime patching very easily.

While it is true that an efficient implementation of incremental type inference in ST might be complex, doing so could yield huge gains in performance and safety while retaining flexibility and liveness (w.r.t. performance, witness the specialization for numerics in Exupery).

In fact, there is already (at least) one type-inferencer for ST: Chuck, by Lex Spoon. Although I haven't tried it yet, he claims it works well for 100+ KLOC.

One very nice thing about ST is the ability to browse all senders of a given message, in the entire system. That such things are robustly present makes the implementation of a type-inferencer even easier. But imagine being able to browse for all objects of a given *type*.


I appreciate the point about liveness, though -- and it would definitely be nice to see more work in this area for Haskell, OCaml, etc.

Smalltalk vs. Curtis vs. O'Caml

tblanchard: The editor doesn't infer the type (at least not in Smalltalk). It checks to see if the message your sending has a handler somewhere in the system and suggests similarly named messages in case you mistyped.

Yes, I know what Smalltalk does. :-) But I don't think Curtis was talking about Smalltalk; I believe he was talking about his language-in-progress.

tblanchard: Second - types are a red herring. I don't care what type something is, I care about what messages it responds to.

Types define what messages something responds to. More about this in a moment.

tblanchard: Third - compile time type inference - you can't do it in a live system with objects that can change shape/class etc (again - this is Smalltalk). Suppose I haven't defined that type at the time I write the code or I add a type to the system later or remove one - you'd have to recompile every method that may reference that type. This just isn't too feasible.

In a statically-typed system with structural typing, such as O'Caml, this simply isn't true. Let me use the subject/observer pattern again, since it's succinct and nicely demonstrates the points. I want a class that can notify a list of observers when something interesting happens. I want not to define what that "something interesting" might be. I also want no interdependencies between the subjects and observers. I want to be able to compile them separately, subclass neither, either, or both of them at will, and without recompiling the other. Here's my entire session, from my shell on. Everything following a "#" and up to a ";;" is what I've typed; everything after the ";;" up to the next "#" is O'Caml's response:


Valhalla:~ psnively$ ocaml
        Objective Caml version 3.09.2

# class ['O] subject =
  object (self : 'selftype)
  val mutable observers : 'O list = []
  method add_observer obs = observers <- obs::observers
  method notify (message : 'O -> 'selftype -> unit) = List.iter (fun obs -> message obs self) observers
  end;;
class ['a] subject :
  object ('b)
    val mutable observers : 'a list
    method add_observer : 'a -> unit
    method notify : ('a -> 'b -> unit) -> unit
  end
# class ['S] observer = object end;;
class ['a] observer : object  end
# class ['O] window =
  object (self)
  inherit ['O] subject
  val mutable position = 0
  method draw = Printf.printf "[Position = %d]\n" position
  method move d = position <- position + d; self#notify (fun x -> x#moved)
  end;;
class ['a] window :
  object ('b)
    constraint 'a = < moved : 'b -> unit; .. >
    val mutable observers : 'a list
    val mutable position : int
    method add_observer : 'a -> unit
    method draw : unit
    method move : int -> unit
    method notify : ('a -> 'b -> unit) -> unit
  end
# class ['S] manager =
  object
  inherit ['S] observer
  method moved (s : 'S) = s#draw
  end;;
class ['a] manager :
  object constraint 'a = < draw : 'b; .. > method moved : 'a -> 'b end
# let w = new window in w#add_observer (new manager); w#move 5;;
[Position = 5]
- : unit = ()

So this is interesting: the compiler happily let me define the subject class without knowing anything about the (not-yet-existent) observer class—the subject class is parameterized by the type variable 'O just so that it can be referred to in method signatures like the one for the "message" function that the "notify" method expects. But that's it—when you see what O'Caml responds with, you can see that it has no idea what the 'O or 'selftype actually is, and so it leaves them as type variables ('a and 'b). The observer class obviously doesn't know anything about the subject class. All it knows is that its subclasses will eventually be able to refer to the subject class, since it's parameterized by a type variable for it, 'S.

Now to use the pattern, I define the old standby, a window class and a manager class. The window class has a position and can move. It also knows how to draw itself. When the window moves, however, it notifies all of its observers that it has moved. How? By calling the "moved" method on the observers. How does it know that the observers have a "moved" method? It doesn't, or rather, by calling "moved" we've said that the observers must have a moved method. But that's all we've said, so O'Caml now knows that the observer type ('a in the response O'Caml gave) must be constrained, and it tells us what the constraint it inferred is: constraint 'a = < moved : 'b -> unit; .. >. This says, in O'Caml-speak: "'a can be any object type that has a 'moved' method that takes a parameter of type 'b," where 'b is the window type or any of its subtypes. It doesn't have to inherit from the observer class. It doesn't have to be of a class at all; you can instantiate immediate objects in O'Caml. It just has to have a "moved" method that takes a window.

So the "manager" class does exactly that: it says "given some notional type 'S, I implement 'moved' taking an 'S as a parameter, and all I do is call the 'draw' method on the parameter." As you can see, O'Caml again dutifully says "moved can take anything, as long as it has a 'draw' method."

So "window" and "manager" still don't know anything about each other. It's only when I say let w = new window in w#add_observer (new manager); w#move 5;; that the compiler attempts to satisfy the constraints that it inferred, and we see [Position = 5] because "window" has called "moved" on "manager," and "manager" in turn has called "draw" on "window."

What happens if we try a mismatch?


# let w = new subject in w#add_observer (new manager); w#move 5;;
This expression has type < draw : 'a; .. > manager subject
It has no method move

The compiler won't compile the code because "subject" has no "move" method. OK, how about the other way?


# let w = new window in w#add_observer (new observer); w#move 5;;
This expression has type 'a observer but is here used with type
  < moved : 'b window -> unit; .. > as 'b
Only the second object type has a method moved

A plain observer has no "moved" method. Again, this is a compiler error, not an exception. There have been no exceptions at any time in this process. But neither, at any time, have I had to say "the window class and manager class are related to each other," nor have I had to declare anything other than that someday some types will exist, and when they do, I want to use their "whatever" method. I could put "subject," "observer," "window," and "manager" each in their own file, compile them separately, and link them. It works like a charm.

By the way, < moved : 'b -> unit; .. > is a type. It just happens to be a type that means "any object with a moved method taking a parameter whose type is not fixed," i.e. the method is polymorphic. Types are not (necessarily) the narrow fixed beasts that you seem to think they are. I especially recommend Code Reuse Through Polymorphic Variants for more on how O'Caml supports open programming without having to recompile existing code.

tblanchard: You seem to have a static view of software - I prefer to work with interactive systems that are live.

So do I, which is why I always have an O'Caml toplevel running.

tblanchard: Live systems morph...

Most assuredly.

tblanchard: at times they enter inconsistent states

Not necessarily.

tblanchard: but these states are typically easy to recover from and to detect

Categorically untrue for non-trivial systems (think 1,000,000 lines of code written by a team of a few dozen programmers over a decade or so).

tblanchard: Furthermore, I typically test as I write

Me too, or even before: writing tests first helps define my interface. I can (and do) define the signature for a module so that my unit tests compile, then I write the implementations, then I get the tests to pass. Note that this is all in the O'Caml toplevel. Then I copy the module signature into an .mli file, the implementation into a .ml file, and then I can use ocamlc or ocamlopt on them.

tblanchard: and are more likely correct than systems that die and are resurrected over and over again.

Certainly—by the time I commit to .mli and .ml files, my code's been aggressively tested and refactored, too.

The bottom line is that none of the things that you enumerate as benefits of dynamic typing are actually benefits of dynamic typing. It appears that I'm going to be spending the rest of my career battling these pernicious myths.

Termination is provable

Curtis W: I suppose static typing allows for solving the halting problem, as well? ;-)

Not quite, but certain type systems surely can guarantee termination.

Double *sigh*


This distinction is only becoming more indicative of static typing's value over time, rather than less, as we see with the work on type system support for concurrency, distribution, and security: Software Transactional Memory, TyPiCal, HashCaml, Acute, Flow Caml... all of these represent statically-typed solutions to problems such as type-safe marshalling, proving the absence of deadlock and race conditions, and proving code adherence to security policies, all before the code gets to the customer, and all of which are quite literally impossible to provide a dynamically-typed analog for.

The pity is that all that work on type systems never really makes it beyond the prototype level, whereas dynamically typed languages *are* doing much of the work nowadays (it would be interesting to estimate the vast amount of code written in shell, PHP, Perl, Python, JavaScript and Ruby that is silently churning away and keeping the net working). This is a small indicator for me that people seem to be more productive in dynamically typed languages.


Proving, as in, the code doesn't compile if it's wrong. Proving, as in: there's no need to test those features (of course, you still need to test features that can only be realized by the dynamic semantics of the language in question).

Sigh. If only life were that simple...


as software is becoming more concurrent, more distributed, and (it is devoutly to be hoped) more secure, not less.

Alas, software is becoming less secure, less reliable and less robust (I can only speak of the particular niche I have gained a little experience in, which is embedded systems, including medical and security-related applications).


If the future looks grim for anything, it's dynamic typing, which is really best suited for exploratory programming in poorly-understood domains with small teams of genius-level programmers (per Marvin Minsky).

You've heard of Ruby, yes? People are actively turning their back on that little static typing they have used (Java), as hardware has become fast enough to finally make the runtime-efficiency of highly dynamic languages irrelevant in most cases.

It's all about productivity and simplicity. This includes OS integration and library support. And in that area no statically typed language has, up to now, been able even marginally to reach the productivity of languages like, say, Perl and Python. To really make a difference, statically typed languages have to break out of their ivory tower and become more accessible and more pragmatic, which seems to be difficult when you have to teach someone monads just to print "Hello, world!" on the screen... ;-)


"Letting the programmer do as he wishes" is a great policy for a single person working on a pet project, or a small team of genius programmers. I can tell you from about a quarter century of industry experience, it's an unmitigated disaster for the industry.

That you need genius programmers to attempt any large scale software project in a dynamically typed language is IMHO largely a myth and I'd like to see some evidence for that.

I have seen and experienced enough disaster in languages that were statically typed (in the weak, C++-like meaning). I have also not seen enough evidence of "real" statically typed languages being in a better position. The most prominent use of a high-level language in a commercial, high-reliability setting is still Ericsson's switch, which is written in a dynamically typed language. I have also not found any replacement for Emacs, even though static-typing advocates tell me that large numbers of people hacking away in a dynamically typed language are a recipe for disaster. The web I'm surfing is driven by a good deal of Perl, PHP and Javascript.

Letting me do what I wish is something I have longed for quite some time. We have to get that idea out of people's heads that programmers are dumb asses who have to be controlled every step of the way - more quality (which is hopefully what we all really want) will not be gained by restricting programmers, but by enabling them, increasing productivity, fun and craftsmanship. Good coders are still what we need most, as good coders make the choice of language less important.

Breaking out of the Ivory towers

To really make a difference, statically typed languages have to break out of their ivory tower and become more accessible and more pragmatic, which seems to be difficult, when you have to teach someone monads just to print "Hello, world!" on the screen... ;-)
Back to Isaac's point in his original post, my opinion is that both very dynamic and very static languages are gaining traction - and I don't think that is contradiction. Since you mention Perl, I'd think the Perl community can appreciate the ivory tower languages like Haskell, since Pugs has become an important part in helping direct the future of the language.

On a more general note, there are four PLs that interest me the most: Scheme; ML; Haskell; and Oz. If I had to list the most dynamic language in the universe, it would be Scheme (i.e. Lisp). If I had to list the most static language currently on the planet, it would be ML. The dynamic nature of Scheme makes Ruby look stale and boxed in. The static nature of ML makes Java look like a dynamic language with an unsound type system. Even though Scheme and ML exist on the very opposite extremes of the dynamic/static equation, I find that these two languages share much more in common with each other than they do with languages like Ruby, Python, Java, or any of the other typical languages.

I would also say that it is the cross-pollination between quite different PLs that provides us with some of the most notable innovations: statically typed Simula giving birth to the dynamically typed ideas of Smalltalk; dynamic typing inspiring HM type inference.

Anyhow, instead of being forced to decide between the two, perhaps we need to demand that the decision be less painful.

Amen

Thanks for saying many of the things I'd wished to say, except doing so successfully!

OT: OCaml question

Sorry to interrupt this dynamic/static debate, but I might as well make this a productive exchange for the sake of the SICP translation for OCaml. :-)

I haven't quite found my way around the OCaml documentation just yet - mostly relying on Andreas' comparison charts to SML. So I can do abs(-3);; on an integer. But what's the abs function for floating-point numbers? For that matter, where's the documentation for the equivalent of the Math.xxx functions in SML? And is there a way to do the standard add, subtract, multiply, and divide that will polymorphically cross the integer/floating-point boundary?

Strangely enough, SML and Oz act the same way in terms of not mixing integers and floats, whereas Haskell's type classes seem to provide a bit more flexibility in terms of mixing types across operations.
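
For contrast, here is a quick Python sketch of the boundary-crossing the question asks about: Python's arithmetic operators and abs are polymorphic across int and float, whereas OCaml keeps the two worlds apart ((+) vs (+.), abs vs abs_float).

```python
# Python's numeric operators cross the int/float boundary implicitly,
# unlike OCaml, where (+) is int-only, (+.) is float-only, and
# abs/abs_float are separate functions.
print(abs(-3))     # 3
print(abs(-3.5))   # 3.5
print(2 + 3.5)     # 5.5 -- the int is promoted to float automatically
```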

abs_float

Ahh... Pervasives.

The pervasives link is what I needed.

Thanks.

in Buried Treasure static = bad

Of course I've seen plenty of articles that extol the virtues of dynamic Ruby/Python/... over the costs of static C++/Java/...

Buried Treasure is the first time I've seen that argument plus praise for Haskell OCaml Scala without the slightest acknowledgement that those languages are statically type checked.

imo, whatever we think about dynamic/static, when we praise Scala as "a terrific functional language", in all honesty we need to admit that Scala is statically type checked.

ML and Scheme .. and Lua

I also find myself strangely drawn to both Scheme [hi Felix!] and a strongly-static language (for me, Haskell).

Perhaps it is good to remember the implementation of the (very dynamic) language Lua in OCaml: Lua-ML and the associated papers (which have, of course, been mentioned on LtU, e.g. here and here).

I find myself thinking about programming with monads as (type-safely) building up S-expressions.

All Good Points

But they all come down to what's happening now—it's a commonplace even among static typing advocates that looking at C/C++/Java as the standard-bearers of static typing isn't fruitful. However, there are good alternatives that aren't prototypes, such as Standard ML, O'Caml, and Haskell, with interesting recent news such as that about Haskell becoming the tool of choice for the various (obviously, non-kernel) Linspire tools that they provide. My thesis is that in the more concurrent, distributed, secure future, tools such as Software Transactional Memory, a typed pi or blue or somesuch calculus, an information-flow type system, etc. will prove more popular in industry than the dynamic alternatives. I see that Peter van Roy has been kind enough to post as well; please see my reply to him for more about my thinking as to why that will ultimately prove to be the case.

The importance of static typing is exaggerated

But let's be clear: static typing is becoming more important, not less, as software is becoming more concurrent, more distributed, and (it is devoutly to be hoped) more secure, not less.

Erlang is a dynamically typed language that is used to build highly available distributed systems in industry. E is a dynamically typed language that provides the most sophisticated implementation of capability security in a distributed setting. Both examples go counter to your statement.

In my view, the importance of the static/dynamic distinction in typing is exaggerated (at least in the LtU discussions, where it is blown out of all proportion). The main point regarding typing is that a language must be *strongly* typed. But language expressiveness is even more important. We have just barely begun to scratch the surface in exploring language expressiveness. That is where the real gains are to be had. Erlang and E show what can be achieved with only modest gains in language expressiveness. They achieve this despite being dynamically typed! In my view, this shows that the advantages of increasing language expressiveness *far* outweigh any advantage obtained from static typing.

Postscript: See my recent talk at IRCAM for some examples of language expressiveness. The last example in the talk is an implementation of a contract net protocol in three lines. It uses a combination of higher-order functional programming, dataflow synchronization, and asynchronous message passing.

traction

The subject of expressiveness never seems to get any traction. People mostly seem to miss the point altogether. The names, function definitions, class systems, all the stuff that expresses semantics should describe the system in a language of the target domain. This is far from the case. If programming languages were expressive in their target domain, many programming problems would go away. Can we talk "knowledge representation" here? Knowledge representation can easily be a part of the programming activity.

Expressiveness vs Static Typing

I think that static typing is something that adds to the expressiveness of a language. A good static type system allows the programmer to express important logical properties of programs in a machine readable and verifiable manner with a minimal burden.

Types as expression

Types are certainly "expressive", but typically not in any application domain. For instance, the type Integer is expressive, but the type "population" is more expressive, and the type "population of Europe" is even more expressive. While we are talking population we might as well include some facts and constraints about populations in general. This thinking quickly leads to, and overlaps with, knowledge representation. For instance, "class" systems are related but not identical to semantic nets, but in general "classes" don't seem to be used to achieve semantics in any target domain.
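
As a sketch of that progression in Python (the Population class and its figures are hypothetical, invented purely for illustration): a bare int says less than a domain type that carries its own constraints.

```python
class Population:
    """Hypothetical domain type: a non-negative count of people.
    It carries a constraint that a bare int cannot express."""

    def __init__(self, count):
        if not isinstance(count, int) or count < 0:
            raise ValueError("a population is a non-negative integer")
        self.count = count

    def __add__(self, other):
        # Populations combine by addition; multiplication, negation,
        # etc. are deliberately left undefined for this domain.
        return Population(self.count + other.count)

europe = Population(745_000_000)
asia = Population(4_700_000_000)
print((europe + asia).count)  # 5445000000
```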

It seems to me that you're

It seems to me that you're confusing static typing with explicit types, and dynamic typing with inferred types. While that may usually be the case, it's entirely possible to have typed variables in a dynamic language.

Semantics

Rule systems and/or knowledge representations can be compiled or dynamic. You don't have to buy into some colossal language system; simply decide on your semantics and do an interpreter for it. Personally, I like to talk about semantics, not implementation strategies.

An Important Point

I think we all can agree, especially here on LtU, that we're seeking gains in expressive power. I remain a huge fan of E particularly—it's the one-stop shop for correct security thinking today—and I still use Erlang as an example myself of how extreme reliability in software can be achieved.

With that said, your main point, which I take to be that strong typing is the key to reliability rather than static typing, is, I think, a crucial one. Certainly the point has been made that most software disasters of my, and others', acquaintance have stemmed from C/C++'s weak typing.

Where both Erlang and E suffer with respect to adoption by mainstream programmers, I believe, is in enforcing a particular unfamiliar runtime model (message passing, little or no state, lambda-as-objects) in addition to whatever hurdles a new syntax always seems to impose. Erlang also has the Lisp problem—"It's good for Ericsson and telecom work and not much else" is a fair-or-not bit of feedback, just as "Lisp is only good for AI" is. Finally, E has serious performance problems in its Java implementation that get in the way of a proper evaluation of its conceptual, vs. pragmatic, value.

Of course, Oz remains a huge inspiration—it has obviously avoided the limitations of the other dynamic languages in terms of expressive power and largely in terms of performance, although I'd love to see a JIT for it. I'd especially love to see a working Oz-E!

Nevertheless, what I believe will happen/am contributing to, slowly, in my own small way, is the development of a language that looks familiar to C/C++/Java programmers, will generate native code for a number of platforms, but under the hood relies on partial continuations, region inference, software transactional memory, maybe dependent types/Ontic-like semantics, etc. to finally provide a language in which the compiler will tell you, before delivery, if you violate any of a much broader range of invariants than you see with current statically-typed languages. This is essentially Tim Sweeney's vision, at least as I understand it, but let me be totally clear that the opinions I'm expressing are my own, not Tim's. This may indeed seem overblown—heaven knows I get too passionate about it—but I think people frequently overlook the point that once such a language is defined and implemented, people who right now suffer from attempting to get concurrency, distribution, and security right in any of the currently popular programming languages, whether that's C, C++, Java, C#, Python, Ruby, Perl, Tcl... will have someplace to go that isn't scary, either syntactically or semantically.

Now, that doesn't mean I'm running around telling people not to use, e.g. Oz. On the contrary; when someone asks me today about learning to program/learning CS, I point them to CTM and Oz and offer to work through the exercises with them. All I mean is that the distinction between batch-compiled and interactive is orthogonal to the distinction between statically- and dynamically-typed, and that we can create a language with the before-delivery guarantees of static typing in a much broader range of domains than is currently possible, with the syntactic and semantic "feel" of some blend of the currently popular languages. Particularly if the type system is dependently-typed or Ontic-like, I think the whole either-or perspective will ultimately recede. Unfortunately, as you know, statically-typed vs. dynamically-typed is one of the most fundamental design choices you can make, and so the debate continues.

Speaking of flexibility

Some of the weaknesses may be taken care of, but that certainly doesn't mean such a language would be as flexible as a dynamically typed one.

And a dynamically typed language doesn't mean it has the flexibility of a statically typed language. Last time I checked, neither Ruby nor Python had macros.

Letting the programmer do as he wishes, performing error checking at compile time when sensible, and using an editor which actually tells you information about what's going on (e.g. types) pretty well negate most, if not all, of dynamic typing's disadvantages.

Except it's an impossibility for an editor to infer the types of certain variables (e.g. method arguments and return types). Yeah, the editor could do some deep magic to infer some of those. But it's certainly not a robust solution.

Boo has duck typing, but is statically typed. I believe Ocaml has structural subtyping.

Dylan and Lisp seem to take a hybrid approach, which always seemed to be the most flexible to me. Dylan (although technically dynamically typed) was very aggressive with type checking, and you could declare types. The Functional Objects compiler gives out very detailed analysis of what it could infer/optimize.

The Ruby interpreter still works at the syntax tree level. So until it gets its VM (YARV), it doesn't seem all that "flexible" to me.

And as far as trends are concerned, I see most research geared towards statically typed languages. Even Lisp seems to have been put out to pasture in that regards.

heh

And a dynamically typed language doesn't mean it has the flexibility of a statically typed language. Last time I checked, neither Ruby nor Python had macros.

Nor do they need them, but that's a different discussion.

Except it's an impossibility for an editor to infer the types of certain variables (e.g. method arguments and return types)

It depends on your definition of "type." Mine includes compound types, e.g. int | float.

But it's certainly not a robust solution.

How so?

I don't need no stinkin' compiler

Nor do they need them, but that's a different discussion.

There's lots of things that Ruby or Python don't "need". Does Ruby "need" the metaprogramming capabilities that it has now? Or would it not need it if it didn't have it? But the point is that handwaving about flexibility doesn't really get us anywhere.

It depends on your definition of "type." Mine includes compound types, e.g. int | float.

You were talking about smart editors. I pointed out certain things that smart editors in purely dynamically typed languages can't do.

How so?

Because as stated before, there's certain things that smart editors won't know the types for, e.g. return types and argument types.

There's lots of things that

There's lots of things that Ruby or Python don't "need". Does Ruby "need" the metaprogramming capabilities that it has now?

Suffice it to say that macros could be improved.

Because as stated before, there's certain things that smart editors won't know the types for, e.g. return types and argument types.

What gives you that idea?

Unknowing

What gives you that idea?

How can an editor possibly know the type of an argument when there isn't a type declaration? Or how does an editor know the return type of a method/function when it can be anything?

common sense

"know the type of an argument when there isn't a type declaration?"

Here's how programmers know the type of a variable in dynamic languages: by context. You have a method Person.setName( x ). What is the type of x? Even with a stupid name like that, I'm pretty sure it's likely to be a string. Why?! Because it's common sense, and dynamic language programmers like to rely on that. That's why we mostly don't need IDEs or other codenannies: because common sense and simplicity go hand in hand in providing the programmer all the context they need.

Static typing is great for tools to assist programmers who rely on heavy, cluttered languages and their tons of frameworks coping with their severe limitations...

Weird

I can't remember the last time I had a name field hold a String, rather than a structured Name object. For that matter, it's rare that I have an object with entity semantics have a primitive String as a field. Much safer to wrap those in strongly typed value objects. Saves a boatload of bugs. It's the same with numerics as well, unless they are actually used for arithmetic.

Context and common sense are very much in the eye of the beholder, and everyone's got different levels of safety and reliability constraints.

--Dave Griffith

abstractions all the way down

"a structured Name object"

Well, that sounds silly to me: eventually, i guess we'll be replacing all primitives with more "sophisticated" abstractions and make that a laudable goal by itself. But then, people go really bananas when influenced by statically typed declarations all around and complex frameworks which call for endless layers of abstraction.

I can even imagine an Age field comprised of about half a dozen methods, some 3 private instance variables and 3 getters and setters... oh joy! will it ever reach the bottom?

paraphrasing:
The more I know statically typed programming languages, the more I like Scheme...

Misunderstanding

I'm not necessarily talking about simply wrapping a string. If I see "setName" in a payroll application, I'm most likely going to assume there is some Name object that includes stuff like first name, last name, prefix, title, and probably half a dozen other things if it's an international payroll application. If I see "setName" in an XML processing library, I'm going to assume that it takes some Name object that understands the intricacies of XML namespaces. If I see it in a Java class processing application, I'll assume the Name object supports packaging and qualification. I've seen all of these in the wild, in code I would consider mature and well-engineered.

Consider "setURL". In some applications, a URL would be reasonably represented as an unstructured string. In other applications, that would be a Really Bad Design, as it would require scattering URL parsing and formatting code throughout the codebase.

I can even imagine an Age field comprised of about half a dozen methods, some 3 private instance variables and 3 getters and setters... oh joy! will it ever reach the bottom?

In a life insurance pricing application (which I've written in past lives), an Age object that simple would be seen as a woefully under-engineered first draft, and simply storing age as an integer would be laughably naive. It's all about what the domain requires.

The "common sense" you're recommending is actually context, and only works if all of the developers on the team share that context. If something as supposedly simple as "setName" can mean four completely different things in four different application domains, how can it possibly be common sense?

If all of the developers on your team share context, then you can certainly ship code without the infrastructure of static typing. Have fun. I know I have writing dynamically typed code. Unfortunately sharing context gets more difficult as the teams get larger, less experienced, less stable, longer lived, or more globally dispersed. In those cases (which is to say, most commercial development), you'll need to make more and more of your context explicit, either through documentation or language constructs.

overengineering

"If I see 'setName' in an payroll application, I'm most likely going to assume there is some Name object that includes stuff like first name, last name, prefix, title..."

how about

def setName( self, name ):
  self.name = Name( name )
  ...

and then
person.name.setPrefix( "Sir" )

etc...

thing is, the Java way of thinking goes for complicated matters, like:
Name name = new Name();
name.setName( "Alfred" );
name.setLastName( "Hitchcock" );
name.setPrefix( "Sir" );
person.setName( name );

let interfaces be simple! Let the setName method accept a simple string -- because that's what it really is -- and let the object dealing with it parse it and chop it into its many parts if it will.

"In some applications, a URL would be reasonably represented as an unstructured string... it would require scattering URL parsing and formatting code throughout the codebase."

I agree with you. Nobody likes redundant scattered code. That's why we put it in functions. I'm really talking more about the interface to the problem, which should be simple and follow common sense. setURL, to me, should accept a string, because it is a string. Whatever you do to it internally isn't really my problem... you may be willing to have a url.getParams, but I could just have a getURLParams function operate on the string instead.

"an Age object that simple would be seen as a woefully under-engineered first draft"

you realize i was joking, right?

"simply storing age as an integer would be laughably naive."

An age is an integer, with a few constraints attached. Whether those constraints are enforced by a static type system or verified at run time is a matter of choice.
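
A minimal sketch of the run-time alternative in Python (the Age class and its bounds are hypothetical, chosen only to illustrate the point):

```python
class Age:
    """Hypothetical: an integer constrained to a plausible range,
    with the constraint checked at run time rather than by a
    static type system."""

    MAX_YEARS = 150

    def __init__(self, years):
        if not isinstance(years, int) or not (0 <= years <= self.MAX_YEARS):
            raise ValueError(f"not a valid age: {years!r}")
        self.years = years

print(Age(42).years)  # 42
```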

"The "common sense" you're recommending is actually context, and only works if all of the developers on the team share that context."

yes, it's difficult to realize what is common sense and what is not if even highly trained and educated individuals in the same team can't agree on it. It's refreshing to know the sun is still the sun, though...

"If something as supposedly simple as "setName" can mean four completely different things in four different application domains, how can it possibly be common sense?"

It just matters that Person.setName -- my example -- means just one thing. And that it be common sense, or the least common denominator, or primitives if you will.

"you'll need to make more and more of your context explicit"

bureaucracy. Contracts and more contracts, because suddenly people are very careful with wording for fear of being misunderstood. And you know what? The more severe and cluttered the contracts, the more the confusion...

The "complicated" way could

The "complicated" way could be written more like this in saner languages:

person.name := Name {first = "Alfred",
                     last = "Hitchcock",
                     prefix = "Sir"}

That doesn't seem too bad to me. Similarly, you could do this if you want to parse a string:

person.name := parseName "Sir Alfred Hitchcock"

The problem isn't the precision, it's that Java makes the syntax too verbose to read off easily.

Even in Java

Even in Java you can write:

person.name = new Name("Sir", "Alfred", "Hitchcock");

I'd be amused to see a language where this isn't trivial.

bureaucracy

Yes, you can. You can also be verbose in Python as well. Can be is not the same as to be. This whole culture of tons of explicit declarations, extreme loose coupling, endless abstractions and trust in the compiler to pull it all together is the problem. It's the bureaucracy thing I talked about.

Frankly, i'd really just do this in python:

person.name = "Sir Foobar of Lionhead"

If it's simply a string attribute or a property which parses the string given and generates a Name object or other complex scheme... it's not really my concern.

What matters is that interfaces be simple, the language rich and flexible and people begin to believe more in the "principle of least surprise" when designing interfaces and in common sense when using them rather than relying on multipage contracts...
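
A sketch of that idea in Python (Person and Name here are hypothetical, and the split-based parsing is deliberately naive): a property keeps the interface a plain string while the structure stays an internal concern.

```python
class Name:
    """Hypothetical structured name, built from a plain string.
    The parsing here is deliberately naive: first and last word."""

    def __init__(self, text):
        parts = text.split()
        self.first, self.last = parts[0], parts[-1]

class Person:
    def __init__(self):
        self._name = None

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # The interface stays a simple string assignment; whether a
        # structured object is built behind it is not the caller's concern.
        self._name = Name(value)

p = Person()
p.name = "Alfred Hitchcock"
print(p.name.first, p.name.last)  # Alfred Hitchcock
```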

Try reading this time

Here's how programmers know the type of a variable in dynamic languages: by context. You have a method Person.setName( x ). What is the type of x? Even with a stupid name like that, I'm pretty sure it's likely to be a string. Why?! Because it's common sense, and dynamic language programmers like to rely on that

Try reading, we're not talking about programmers, we're talking about editors.

That's why we mostly don't need IDEs or other codenannies: because common sense and simplicity go hand in hand in providing the programmer all the context they need.

Yeah right...that's why all these editors are always trying to bring IDE functionality via plugins. Hey, I have a suggestion. Just use notepad. You don't need that smart indenting crap.

Static typing is great for tools to assist programmers who rely on heavy, cluttered languages and their tons of frameworks coping with their severe limitations...

No, we're not talking about Java.

re:

"we're talking about editors."

it's still relevant, regardless. Besides, a programmer is an editor, in a sense... ;)

"Hey, I have a suggestion. Just use notepad."

no thanks. vi is far ahead of that junk.

"we're not talking about Java."

it was a comparison needed to make the point that programmers of dynamically typed languages mostly don't need thorough code assistance during programming, because simplicity and common sense are primary design guides rather than the code bloat and declaration paranoia of Java, C# and related mainstream techs...

still, your arguments about the editor having a hard time providing code assistance to discover types of arguments would also prove to be true regarding Haskell and OCaml code. So the matter here is implicit types, not dynamic vs static typing.

thankfully, haskell and ocaml also are much saner than java and company, so IDEs and smart editors are also pretty redundant.

more re:

it's still relevant, regardless. Besides, a programmer is an editor, in a sense... ;)

Uhhhh..no, a human programmer is not a software editor in any sense, but we'll leave it at that.

no thanks. vi is far ahead of that junk.

Sounds like you need a codenanny.

it was a comparison needed to make the point that programmers of dynamically typed languages mostly don't need thorough code assistance during programming, because simplicity and common sense are primary design guides rather than the code bloat and declaration paranoia of Java, C# and related mainstream techs...

Yeah, and once again, I wasn't talking about Java. Take a look at something like Nemerle

still, your arguments about the editor having a hard time providing code assistance to discover types of arguments would also prove to be true regarding Haskell and OCaml code. So the matter here is implicit types, not dynamic vs static typing.

Haskell and Ocaml both have optional type declarations, and sometimes need them. I believe in Haskell it's a common idiom to declare types for functions anyway.

thankfully, haskell and ocaml also are much saner than java and company, so IDEs and smart editors are also pretty redundant.

Then why are you using Vi? Isn't that redundant? Do you ever use '%' to find your braces? I've got the supertab plugin for vim that I love for completion. I type fast, but I don't particularly want to type everything out.

What about Lisp? Lisp would be a nightmare without a smart editor that showed you the matching parentheses.

smart

"a human programmer is not a software editor in any sense"

yes, i guess vi does the editing all by itself...

"Sounds like you need a codenanny."

superior general text editing is not the same as a syntax and semantic aware IDE writing code for you.

"I wasn't talking about Java. Take a look at something like Nemerle"

great! Nemerle doesn't call for VS or any other smart editor either! :)

and, in fact, it would have the same problem as all languages without type declarations, but that's ok because programmers writing and reading the code can figure it out from context and will be thankful for the uncluttered, clean code anyway...

"Haskell and Ocaml both have optional type declarations, and sometimes need them."

still, a smart editor would have no way to guess the types of arguments either, while the function is being written. If declarations are optional, you can't count on them...

"Then why are you using Vi? Isn't that redundant?"

No, vi is just a very good general text processing tool. It's not smart, and does not perform semantic analysis of source text to provide type information and completion; it doesn't feature "friendly wizards" or write code for you. It's just a very good tool for text editing.

"Lisp would be a nightmare without a smart editor that showed you the matching parentheses."

any barely decent text editor should have such a feature.

Manly men use notepad

yes, i guess vi does the editing all by itself...

Exactly, and thanks for realizing that an editor is not a human programmer.

"Sounds like you need a codenanny."

superior general text editing is not the same as a syntax and semantic aware IDE writing code for you.

Wrong again. Why does Vim have filetype plugins if it's not syntax-aware?

"I wasn't talking about Java. Take a look at something like Nemerle"

great! Nemerle doesn't call for VS or any other smart editor either! :)

Nope, and neither does Java.

and, in fact, it would have the same problem as all languages without type declarations, but that's ok because programmers writing and reading the code can figure it out from context and will be thankful for the uncluttered, clean code anyway...

Wrong again. It's statically typed, and already has a completion engine built into the compiler library. So you get the best of both worlds.

No, vi is just a very good general text processing tool. It's not smart, and does not perform semantic analysis of source text to provide type information and completion, it doesn't feature "friendly wizards" and writes code for you. It's just a very good tool for text editing.

Does that make you feel better about yourself because you don't have a "wizard" for a template? But vim has many of those "smart" features, so maybe you're not so hardcore after all.

"Lisp would be a nightmare without a smart editor that showed you the matching parentheses."

any barely decent text editor should have such a feature.

Oh come on, like you say, you don't need those smart editor features. it's a codenanny. Can't you count?

eh?

Your first statement is solved by pretty much standard type inference processes. The second one, although unlikely, would be inferred as any; the only real case where I can see that happening is when you don't use the argument in the function. Otherwise, it'll be inferred using either compound types or structural subtyping, e.g. def print(x): out.write(x.str()) would accept any object with a str function.
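The structural type being described there can be written down explicitly. As a sketch in Python (3.8+) using `typing.Protocol`, where `HasStr`, `Point`, and `render` are illustrative names, not anything from the post:

```python
from typing import Protocol

class HasStr(Protocol):
    # the structural type "any object with a str() method"
    def str(self) -> str: ...

class Point:
    # note: no inheritance from HasStr -- matching the shape is enough
    def str(self) -> str:
        return "(1, 2)"

def render(x: HasStr) -> str:
    return x.str()

print(render(Point()))  # → (1, 2)
```

A structural checker accepts `Point` wherever `HasStr` is expected, which is essentially the compile-time counterpart of duck typing.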

Uhh, no

Your first statement is solved by pretty much standard type inference processes.

You write a method with its arguments. At the point of declaration, it is impossible for the editor or compiler to know what the type is. The next step is for the editor to infer a type based on method usage within the method body. And if the parameter is passed into another method, then the editor needs to dig down into that method too.

At this point, you've basically written a type-inferencer for your dynamically-typed language. The next step is to plug it into a compiler, and voilà, you've now changed Ruby into a statically-typed language.

Yes, the example is a bit silly, but indicative of all the hoops that you would have to go through to get what is fairly trivial with statically-typed languages.

The second one, although unlikely, would be inferred as any; the only real case where I can see that happening is when you don't use the argument in the function. Otherwise, it'll be inferred using either compound types or structural subtyping, e.g. def print(x): out.write(x.str()) would accept any object with a str function.

I think you're getting my first and second statements confused or something. My first statement was about types of arguments, and my second statement was about return types.

Now for return types you would basically have to do what I described in the first example and have the editor construct a structural subtype for its own use. Go through the method body and find all return paths, go through the objects that are returned, find all their methods, and set up an intersection of all the methods of all the return types.
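That intersection step can be sketched as a toy helper; this is a deliberately naive illustration (names like `common_interface`, `Circle`, and `Square` are made up), not a real inferencer:

```python
def common_interface(return_types):
    """Intersect the public method names of all candidate return types."""
    method_sets = [
        {name for name in dir(t) if not name.startswith("_")}
        for t in return_types
    ]
    return set.intersection(*method_sets)

class Circle:
    def area(self): ...
    def draw(self): ...

class Square:
    def area(self): ...
    def resize(self): ...

# A method that may return either a Circle or a Square can only be
# assumed to support the methods both types share.
print(common_interface([Circle, Square]))  # → {'area'}
```

A real tool would work on inferred types rather than concrete classes, but the principle (intersecting the interfaces of all return paths) is the same.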

Now if you can show me such an editor, I would be thoroughly impressed.

At this point, you've

At this point, you've basically written a type-inferencer for your dynamically-typed language.

That would be the point, yes.

The next step is to plug it into a compiler, and voilà, you've now changed Ruby into a statically-typed language.

That would be the first step, actually. The second step would be using the type inference engine from the compiler, which was abstracted into a library, in the editor.

Yes, the example is a bit silly, but indicative of all the hoops that you would have to go through to get what is fairly trivial with statically-typed languages.

I'd rather have a more complicated compiler that allows for simpler code than a simple compiler with complicated code.

Now if you can show me such an editor, I would be thoroughly impressed.

It wouldn't be terribly hard to implement with the type inference engine already present as a library. In fact, I plan on doing just that in a little bit.

If type inference was easy...

...it would have already been done. And by easy, I mean theoretically possible. It's one thing to demand something really hard from the environment. It's another to demand that a to-this-point intractable problem be solved.

what?

Apparently we've got a communication problem somewhere. The type of type inference I'm talking about is completely possible.

Depends on the goal

Most existing soft type inferencers have tried to be perfect, attempting to compete with fully statically checked languages, rather than just producing some results that programmers would find useful.

Of course, doing the latter will never make people like Paul happy, and it may not be that sexy on a grant application either. But if the goal is simply to provide useful information to DT programmers in an IDE, then none of that matters. There's a lot that could be done quite easily that isn't being done today. The Smalltalk refactoring browser, while not exactly the same kind of thing, hints at what's possible.

However, one caveat is that the usual DT suspects like Python and Ruby may not be the best languages to which to apply this sort of thing: they're often gratuitously dynamic — dynamic in ways which cost heavily in terms of static analysis but don't buy much. So for the mainstream DT languages, such tools may be a long time coming.

curious

However, one caveat is that the usual DT suspects like Python and Ruby may not be the best languages to which to apply this sort of thing: they're often gratuitously dynamic — dynamic in ways which cost heavily in terms of static analysis but don't buy much. So for the mainstream DT languages, such tools may be a long time coming.

I'm curious. Why do you think python and ruby are "gratuitously dynamic"? I happen to be writing a language similar to python/ruby which will be compiled, but I haven't found anything that might cause me difficulties due to its dynamic nature.

Meouch

(The sound made by the proverbial cat)

By "gratuitously dynamic", I mean that those languages have various dynamic features that are often used to do things which could be done just as easily without the use of dynamic features. Classic examples are things like the ability to add methods or fields to an object at runtime, and various other uses of metaclasses; or the use of eval or exec to interpret source code at runtime. There are a number of other more subtle issues, as well.

Some of these issues are discussed in a post by Joe Marshall on the PLT Scheme mailing list.

If you restrict yourself to a more static subset of the language, or restrict your inferencing attempts to a more local context, you can do OK within those constraints. Existing attempts to do type inference in Python all do just that, in various ways.

Since this is a subject in its own right, I've posted a new story about Type inference for Python, with links to some articles and papers on the subject. Let's move any discussion of that subject there, to avoid interfering with the ongoing holy war here. :)

Another Excellent Point

As usual, Anton's making good sense here: certainly when I look at a Python or Ruby, I see an architecture that seems basically hostile to pinning anything down. Everything's open to being overridden. What's sometimes painful is that I can recall being a huge fan of that approach not that many years ago—I can turn around and pull my copy of "The Art of the Metaobject Protocol" off of my bookshelves in less than five seconds. But I got into that stuff when I really was surrounded by genius programmers all the time—my ICOM Simulations (Darin Adler would go on to be technical lead of System 7.0 at Apple among other things; Waldemar Horwat would go on to define the Javascript 2.0 standard among other things) and Apple Computer days. Later experience in frankly less heady environments revealed to me that languages and frameworks should help programmers, rather than expecting programmers to help them.

Having said that, there's still a huge amount to explore along the dimensions that Peter van Roy and Anton have both driven at: somewhere between the MOP-based Rubies and Pythons of this world and the Haskells and O'Camls of this world must lie some systems that clearly delineate what is known statically and what is OK not to know statically in the interest of expressive power. Dylan was singularly inaptly named because it was explicitly an attempt at just such a delineation (albeit having been developed before the definition of much that's great about modern type theory). Likewise the type inference of CMU CL/SBCL: a fine idea, but Common Lisp is another language that ultimately doesn't lend itself to supporting a rich type system. The bottom line for me is that it's not true that I want to prove everything at compile time, but I do want to prove the tough stuff at compile time. I don't think it will take much to see dramatic qualitative improvements in big software products. Consider Tim Sweeney's observation that about 50% of the bugs in the Unreal 3 Technology codebase would go away with just four features that modern popular languages don't have. Why wouldn't we want that from our next language?

perspective

By "gratuitously dynamic", I mean that those languages have various dynamic features that are often used to do things which could be done just as easily without the use of dynamic features. Classic examples are things like the ability to add methods or fields to an object at runtime, and various other uses of metaclasses; or the use of eval or exec to interpret source code at runtime. There are a number of other more subtle issues, as well.

Ah, I see. Indeed, those items you listed aren't particularly used very much in python. Essentially, they make up python's metaprogramming capabilities and they're in the language for the same reason macros are in common lisp.

If you restrict yourself to a more static subset of the language, or restrict your inferencing attempts to a more local context, you can do OK within those constraints. Existing attempts to do type inference in Python all do just that, in various ways.

Key word there: restrict. I must be missing something; the way I see it, python's flexibility doesn't hinder type inference in the slightest.

Slight hindrances

Key word there: restrict. I must be missing something; the way I see it, python's flexibility doesn't hinder type inference in the slightest.

If the flexibility didn't hinder inference in the slightest, you wouldn't need restrictions. Once you introduce restrictions, the inferencer will only be capable of certain kinds of inference, and not others. The question then is how useful it'll be in practice. There are many simple kinds of inference that are easy to do, but these are usually the least useful to a programmer. Alternatively, you may end up inferring complex union types that are also of little practical use to the programmer. Or the ability to usefully identify potential type errors may be limited. The details are everything here — without seeing exactly what you're doing, I can't comment more specifically.

Alternatively, you may end

Alternatively, you may end up inferring complex union types that are also of little practical use to the programmer.

This is the exact method I'm proposing, although I don't consider it a restriction at all. Could you explain?

Useless types

I think he was saying that you need to restrict the language you'd be applying the type inferencer to to get useful types. For example, if I'm not mistaken in R5RS Scheme you can write at any point in the program (set! + someotherfunction). Thus, the type of something as simple as (+ 1 2) must be, at any use, some kind of universal type. Reflective capabilities are things just like this. This is why Ehud, I believe, often makes the statement that reflection is something that is overly powerful and research should go toward more controlled forms of it.
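A rough Python analogue of the `(set! + someotherfunction)` problem: built-in names can be rebound by any module at any point, so nothing guarantees that `len` at a given call site is still the built-in. (`noisy_len` is a made-up name for illustration.)

```python
import builtins

def noisy_len(x):
    # behaves like len here, but could have been anything
    return x.__len__()

# Any module, at any time, can do this. An inferencer must therefore
# either analyze the whole program or give every call to len a very
# general type.
builtins.len = noisy_len

print(len([1, 2, 3]))  # → 3
```

The call still returns 3 only because the replacement happens to be well behaved; the analyzer has no way to know that without tracing every assignment to the name.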

Of course, if the "exact method [you are] proposing" is to infer "types [...] of little practical use to the programmer", is the problem not clear?

Overall this seems related to soft typing. I criticize soft typing as a means to achieve the benefits of static typing while attempting to circumvent the costs in this subthread (with Anton). The condensed version is: Either you program in a "dynamic style" and the types are useless or you program in a "static style" and have exactly the same "costs" of static typing with probably less of the "benefits". You cannot use that "static style" when and where you feel like it because it is easily poisoned by "dynamic" parts. So, carefully managing the boundary between static and dynamic is necessary to maintain the static parts, but this is exactly what happens in statically typed languages including a type Dynamic or similar features.

This is what I was getting

This is what I was getting at when I mentioned the "system-wide guarantees" of static typing. These are diluted if you use soft typing.

Instead, I'd start with static typing and add dynamic features. Take an expressive enough type system, say Epigram or Ontic-like, that is able to express the exact types you want - not the traditional restrictive ones. Already the system has a lot of the flexibility of dynamic typing. Then add runtime type checking where it is useful (dynamic loading of code, etc) and a Dynamic type for when you really want dynamic typing.

I think he was saying that

I think he was saying that you need to restrict the language you'd be applying the type inferencer to to get useful types. For example, if I'm not mistaken in R5RS Scheme you can write at any point in the program (set! + someotherfunction). Thus, the type of something as simple as (+ 1 2) must be, at any use, some kind of universal type.

Ah, I see. That's not terribly difficult, it just involves finding the type of every possible value. You might object that this is not as useful as a single type, but to get the same flexibility in a statically typed system you end up with exactly the same thing, except that the programmer had to write it manually.

Ah, I see. That's not

Ah, I see. That's not terribly difficult, it just involves finding the type of every possible value.

What is not that difficult? Finding the type of every possible value of what? +? To get anything other than a universal type for (+ 1 2) in the above circumstances, you'd need at the very least a whole program static analysis. You'd have to find every place in the program that can potentially assign to + and every type the assigned values could be.

but to get the same flexibility in a statically typed system you end up with the same exact thing

The whole point of the example is that this is a bit of flexibility that you would not (unrestrained) have in a language you expect to be able to generate meaningful types for.

What is not that difficult?

What is not that difficult? Finding the type of every possible value of what? +? To get anything other than a universal type for (+ 1 2) in the above circumstances, you'd need at the very least a whole program static analysis. You'd have to find every place in the program that can potentially assign to + and every type the assigned values could be.

I don't see a problem with analyzing the whole program.

The whole point of the example is that this is a bit of flexibility that you would not (unrestrained) have in a language you expect to be able to generate meaningful types for.

Again, I see no problem with it. Besides, when was the last time you set + to anything?

You might not have the

You might not have the entire program available at compile-time

At which point the compiler

At which point the compiler makes suitable compromises. I suspect it might be a good idea to optionally allow the programmer to specify types of dynamic modules, though.

Combinatorial Explosion

Unless you take carefully thought-out counter-measures, this kind of analysis is more than likely to explode. Unfortunately, (1) finding suitable "counter-measures" is a decidedly non-trivial problem, even for comparatively simple languages, and (2) they will either restrict your language, significantly weaken the analysis, or both.

Please don't make naive assumptions or claims about the simplicity of such approaches until you have at least some basic knowledge of the field. These are very hard research problems, with sometimes disillusioning results.

Um, what Philippa and Andreas said

For Philippa's point:

You very likely will not have all the source-code/object-code at compile-time. The source-code/object-code may not even exist when the program is (initially) compiled or linked. Some of the source code may well be in a different language.

For Andreas' point:

You'd have to find every place in the program that can potentially assign to + and every type the assigned values could be.

Do you have any idea what the above quoted statement entails?

Finally,

Besides, when was the last time you set + to anything?

Irrelevant. But even then, it is not just + that has to be handled. It is every single function in the program. While set!ing + would be a pretty crazy action, set!ing a user-defined function is much more likely.

on the contrary..

setting + to a different function is a standard way of providing overloading in Scheme:

(set! +
  (let ((old+ +))
    (lambda (a b)
      (if (my-special-type? a)
          (my-special+ a b)
          (if (my-special-type? b)
              (my-special+ b a)
              (old+ a b))))))

(two-parameter version for simplicity)

an old Usenet comment on language efficiency

Fergus Henderson explains why global analysis of Prolog is impractical.
It's quite a different language, but some of the same issues apply.

Soft typing

At this point, you've basically written a type-inferencer for your dynamically-typed language. The next step is to plug it into a compiler, and voilà, you've now changed Ruby into a statically-typed language.

Yes, the example is a bit silly, but indicative of all the hoops that you would have to go through to get what is fairly trivial with statically-typed languages.

If your compiler doesn't force you to run the type inferencer, though, or doesn't fail to compile if typechecking fails, then what you'd have is a dynamically-checked language which warns you about potential type errors. A particularly useful place to do that might be inside an IDE.

Go through the method body and find all return paths, go through the objects that are returned, find all their methods, and set up an intersection of all the methods of all the return types.

Soft typing systems do exactly this.

Now if you can show me such an editor, I would be thoroughly impressed.

There's not much technical that prevents this from being done in some form. How comprehensive the type inference would be would depend on the language - some languages make it easier than others to do inference. If you're willing to accept "holes" where the inferencer can't figure out a useful type, or just punts with a very general type (like "Object"), then it becomes a relatively easy task. (Well, I haven't actually done it personally, but I'll accept a research grant...)

hmm

"a dynamically typed language doesn't mean it has the flexibility of a statically typed language. Last time I checked, neither Ruby or Python had macros."

Last time i checked, macros had nothing to do with static checks, since they run before the compiler/interpreter comes into play. In fact, macros are historically thoroughly associated with Lisp, a dynamically typed language -- with optional static typing in the case of Common Lisp...

"I see most research geared towards statically typed languages"

And i see most practical, pragmatic folks using and advocating use of dynamic, flexible and "agile" languages and frameworks.

Last time i checked, macros

Last time i checked, macros had nothing to do with static checks, since they run before the compiler/interpreter comes into play. In fact, macros are historically thoroughly associated with Lisp, a dynamically typed language -- with optional static typing in the case of Common Lisp...

No they don't, please see the context of the subthread to the point of that comment.

And i see most practical, pragmatic folks using and advocating use of dynamic, flexible and "agile" languages and frameworks.

No you don't. Think about what you posted there.

re:

"please see the context of the subthread to the point of that comment"

yes, i did that before posting. I felt the need to reply because i felt odd that you implied that a lack of macros in ruby/python was a problem for them being dynamically typed, though Lisp counters that...

"Think about what you posted there"

that's what i do before posting. do you?

More context

Some of the weaknesses may be taken care of, but that certainly doesn't mean such a language would be as flexible as a dynamically typed one.

That was what I was replying to. And making the point that saying DT languages are universally more flexible than ST languages is silly.

"Think about what you posted there"

that's what i do before posting. do you?

Here's what you said:

And i see most practical, pragmatic folks using and advocating use of dynamic, flexible and "agile" languages and frameworks.

Obviously, that is meaningless handwaving. But I can turn it around and say that the "pragmatic, practical" folks are advocating and using Java.

That was what I was

That was what I was replying to. And making the point that saying DT languages are universally more flexible than ST languages is silly.

They are when you compare the type system. I thought the meaning was pretty obvious, but apparently not.

You're being picked up on a

You're being picked up on a cultural default that goes in favour of dynamically-typed languages - many people genuinely forget that there might be another kind of flexibility because it's the only sort that people talk about without a qualifier.

Huh, what?

They are when you compare the type system.

They are more flexible when you compare the type systems? Can you explain that one?

Dynamic typing doesn't get

Dynamic typing doesn't get in your way.

Of Course Not

Curtis W: Dynamic typing doesn't get in your way.

Of course it doesn't—by definition, anything that allows you to talk nonsense isn't getting in your way. The reason that dynamic typing works well in most current situations is that most current situations lend themselves to attack by the intuitions that have arisen in computing's first half-century or so.

My poorly-articulated thesis is that this is changing: the demands for good approaches to the tough stuff, such as concurrency, distribution, and security—issues that aren't well-addressed by intuitions arising out of computing's first half-century—are rising. You can address them dynamically by baking assumptions about your runtime model into the language (stateless message passing like Erlang's, for example) and forcing your language's users to adopt that paradigm, leaving open the remaining questions of how you know that you won't run into trouble at runtime because testing can't reveal, e.g. the absence of deadlock or security leaks; or you can address them via a good type system and carefully designed syntax that mimics the programmers' existing intuitions, but offers the additional benefit of rejecting incorrect code before it gets into users' hands. Modern language design is doing an excellent job of coming up with good type systems; the remaining work seems to be in providing such a good system in a language that will appeal to programmers accustomed to working in C, C++, Java, C#, Python, Ruby... and that, I think, is the next step: to craft a credible alternative at least to C, C++, Java, and C#; even Tim Sweeney's "Next Mainstream Programming Language" might not be enough to persuade the Python, Ruby, Perl, Tcl... communities, basically because nothing is likely to be.

re:

"And making the point that saying DT languages are universally more flexible than ST languages is silly."

that's the worst point-making i've ever seen! literally:

"And a dynamically typed language doesn't mean it has the flexibility of a statically typed language. Last time I checked, neither Ruby or Python had macros."

I don't know about you, but i read that as saying that ruby/python are not as "flexible" as a statically typed language because they have no macros! Sure, because macros are a C thingy indeed! and Java features them prominently!

but that's ok. i'm done here...

that's the worst

that's the worst point-making i've ever seen! literally:

I agree. Saying that DT languages are more flexible than ST languages isn't valid.

I don't know about you, but i read that like saying that ruby/python are not as "flexible" as a statically typed language because they have no macros!

This goes back to handwaving about things like "flexibility"

Sure, because macros are a C thingy indeed!

You're joking right?

but that's ok. i'm done here...

You were done a while back.

tsc

last time i feed your troll, i promise!

"Saying that DT languages are more flexible than ST languages isn't valid."

no, you know i was talking about your silly comment that python/ruby are not as flexible because don't feature macros, that you implied are a ST thing.

and sure DT languages are far more flexible than ones that pin you down.

"This goes back to handwaving about things like 'flexibility'"

it's not handwaving, it's a realization.

now, if you don't twist your arguments once more, i won't get back to it, ok?

let's try this once again.

last time i feed your troll, i promise!

Sorry, but I had to respond to your Slashdot-like mentality when you say things like:

Static typing is great for tools to assist programmers who rely on heavy, cluttered languages and its tons of frameworks coping with their severe limitations..

and

And i see most practical, pragmatic folks using and advocating use of dynamic, flexible and "agile" languages and frameworks.

That handwaving troll was just classic.

no, you know i was talking about your silly comment that python/ruby are not as flexible because don't feature macros, that you implied are a ST thing.

Sarcasm and showing absurdity by showing counter absurdity seems to go right over your head.

and sure DT languages are far more flexible than ones than pin you down.

Great, more trolling and handwaving about "pin you down". I wouldn't expect less from you rmalafaia.

it's not handwaving, it's a realization.

Yeah, sure....whatever you say.

A request

Please consider taking your duel to email.

I agree.

I agree.

Sorry Ehud

I'm sorry, I was just intrigued by the double-think.

Shh, don't tell them it's statically typed

It seems that with a smart enough compiler you could have a type-inferenced Ruby. The problem now is that the Ruby interpreter doesn't even do bytecode, it just works on the AST, so that extra work isn't worth it.

But do people really care about dynamic typing in something like Ruby, or do they really care about it being fairly concise and expressive with good out-of-the-box libraries?

If there was a Ruby compiler, with a good top-level, that did type-inferencing even on global functions/methods, would people really miss dynamic typing for those corner cases where the compiler couldn't figure it out? I think a lot of people would give up those corner cases for a very speedy Ruby.

And I'm guessing that eventually YARV will do type-inferencing at the compiler/VM level anyway.

Type inference is only part

Type inference is only part of the appeal of dynamic typing. Structural subtyping is the most usually cited benefit of dynamic typing, which isn't inherent with type inference. You don't necessarily need to go static to get fast code; it's entirely possible to have a pseudo-dynamic language which is on par with most lower-level code, for example.

Type inference and structural types

Structural subtyping is the most usually cited benefit of dynamic typing, which isn't inherent with type inference.

Type inference strongly relies on structural types, so you can hardly do any interesting inference for a language with subtyping if your subtyping relation is not structural.

I mention in passing that a dynamic language can hardly be claimed to have subtyping in the first place... ;-)

Are we really having this

Are we really having this discussion again?

Yokomizo's corollary

That would be Yokomizo's corollary in action :-)

That's somewhat irrelevant,

That's somewhat irrelevant, considering the post was about typing in the first place ;-)

Actually

Actually it was about the oddity of praising Haskell, OCaml, and Scala without admitting that they are statically type checked.

Curtis W: I had to check but

Curtis W: I had to check but I was pretty sure it was you, bytecoder. Registered to say hi (-Rebooted).

Curtis and I are both registered on Gamedev.net, where quite a large discussion on the design of a new language has begun, largely in response to Sweeney's Next Mainstream Programming Languages presentation.

Curtis has been advocating dynamic types, while I've been advocating static type systems, particularly those of Epigram and Ontic. Anyway, instead of having the discussion again, head to Why type systems are interesting.

I'd rather just keep the

I'd rather just keep the discussion here; at least then we don't have to read a hundred or so responses to keep up.

Oh, and Rebooted: just out of curiosity, what gave me away?

Hybrid Languages

Your comments about the editor and a hybrid dynamic-static system tipped me off I think.

I think this discussion would be more fruitful if it was more about how dynamic and how static a type system should be; rather that just another holy war.

I disagree that starting with a dynamic language and adding typing is the correct way of going about things. The system-wide guarantees that static typing can give you about correctness, security, deadlock freedom, etc. outweigh the localized benefits of dynamic typing. You can add features to a typed language that let you use dynamic typing where it is needed. On the other hand, you couldn't really build Haskell on top of a dynamically checked language.

What are the advantages of dynamic typing that you want? We've already talked about duck typing and lack of type annotations, and how they aren't particular to dynamic languages.

The system-wide guarantees

The system-wide guarantees that static typing can give you about correctness, security, dead-lock freedom, etc outweigh the localized benefits of dynamic typing.

It seems you're overextending a bit. Assuming your notion of static typing includes type inference and structural subtyping, the differences between static typing and dynamic typing aren't that great. Pure dynamic typing doesn't preclude catching errors at compile time any more than non-mandatory exception handling precludes proper error recovery.

What are the advantages of dynamic typing that you want? We've already talked about duck typing and lack of type annotations, and how they aren't particular to dynamic languages.

Dynamic typing imposes the fewest annoyances on the programmer.

A dynamic language like

A dynamic language like Tcl/Tk is written in C, a statically type-checked language. One could argue that Tcl/Tk is mostly type checked. I wonder how many other dynamic languages are written in C, translate to C, or otherwise run on a VM written in C?

That's merely because of C's

That's merely because of C's speed. It has no bearing on this discussion and simply illustrates the fact that there aren't many performant, dynamically typed languages, Common Lisp being a notable exception.