Buried Treasure - No Fluff Just Stuff

Buried Treasure pdf

The trend toward dynamically typed languages is both widespread and strong. Less obvious, though, is a resurgence of interest in functional programming and functional languages. ... Haskell ... Objective CAML ... Scala ...

Is it double-think that allows the promotion of both dynamically typed languages and, by name, Haskell OCAML Scala without any acknowledgement that those powerful and efficient functional languages are ... statically type checked?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Not so sure about that

...the adoption of dynamically typed languages is widespread and strong

The whole paper smells of RoR hype more than anything else; seeming to suggest that RoR is driving this "strong and widespread adoption". Python, Smalltalk, Lisp, and other dynamically typed languages weren't on the verge of death before Ruby.

The trend I see is the realization that boilerplate sucks and reading redundant code sucks. Some people think statically-typed languages, and the first languages that pop into their head is C++, Java, C#... They like the loosey-goosy feel of duck-typing and not having to declare types.

But as we know, type-inference, structural subtyping, and having a toplevel to play with can mitigate some of these weaknesses of statically typed languages.

I've really been enjoying playing around with Nemerle lately. It seems a nice blend of OO and functional programming, with type inference, macros, variants/pattern-matching, open classes, optional python-like indentation syntax, along with the ability to use the .NET libraries. In fact, C# 3.0 seems to be going in this direction.

I guess Scala would be something similiar for the Java platform, but I don't know much about it.

not quite

But as we know, type-inference, structural subtyping, and having a toplevel to play with can mitigate some of these weaknesses of statically typed languages.

Some of the weaknesses may be taken care of, but that certainly doesn't mean such a language would be as flexible as a dynamically typed one. I've found that such principles can be applied from the other direction and yield a much more favorable result. Letting the programmer do as he wishes, performing error checking at compile time when sensible, and using an editor which actually tells you information about what's going on (e.g. types) pretty well negate most, if not all, of dynamic typing's disadvantages.

*sigh*

Apart from the irreducible fact that static typing proves the absence of classes of misbehavior before delivering the software, whereas dynamic typing can prove no such thing, leaving everything to testing, which, per Edsger Dijkstra, can only prove the presence of defects, not their absence. This distinction is only becoming more indicative of static typing's value over time, rather than less, as we see with the work on type system support for concurrency, distribution, and security: Software Transactional Memory, TyPiCal, HashCaml, Acute, Flow Caml... all of these represent statically-typed solutions to problems such as type-safe marshalling, proving the absence of deadlock and race conditions, and proving code adherence to security policies, all before the code gets to the customer, and all of which are quite literally impossible to provide a dynamically-typed analog for. Proving, as in, the code doesn't compile if it's wrong. Proving, as in: there's no need to test those features (of course, you still need to test features that can only be realized by the dynamic semantics of the language in question).

At the moment, for most software, the distinction probably isn't that important, and I've acknowledged this before. But let's be clear: static typing is becoming more important, not less, as software is becoming more concurrent, more distributed, and (it is devoutly to be hoped) more secure, not less. If the future looks grim for anything, it's dynamic typing, which is really best suited for exploratory programming in poorly-understood domains with small teams of genius-level programmers (per Marvin Minsky). "Letting the programmer do as he wishes" is a great policy for a single person working on a pet project, or a small team of genius programmers. I can tell you from about a quarter century of industry experience, it's an unmitigated disaster for the industry.

purity

Apart from the irreducible fact that static typing proves the absence of classes of misbehavior before delivering the software, whereas dynamic typing can prove no such thing, leaving everything to testing, which, per Edsger Dijkstra, can only prove the presence of defects, not their absence.

For pure dynamic typing this may be true. What I'm proposing is somewhat of a hybrid--those certain classes of errors you speak of are, for the most part, handled at compile time. The main difference is, of course, that explicit casting is not required, which is easily made up for with a decent editor which highlights possibly erroneous code. If you consider that, it comes down to one thing: who does the work. With static typing, explicit casts help notify yourself of the possibility of a runtime error; with dynamic typing, the editor does this for you.

proving the absence of deadlock and race conditions, and proving code adherence to security policies, all before the code gets to the customer, and all of which are quite literally impossible to provide a dynamically-typed analog for.

I suppose static typing allows for solving the halting problem, as well? ;-) I would like to see examples of these; I doubt they're something that couldn't be done with a hybrid system.

"Letting the programmer do as he wishes" is a great policy for a single person working on a pet project, or a small team of genius programmers. I can tell you from about a quarter century of industry experience, it's an unmitigated disaster for the industry.

The only circumstance I can imagine where dynamic typing with a decent development environment is that bad is when the programmers on said project aren't too knowledgeable in their profession, in which case you're already screwed if you try to get them to do anything moderately hard, regardless of the type system.

This Always Gets Confusing

Curtis W: If you consider that, it comes down to one thing: who does the work. With static typing, explicit casts help notify yourself of the possibility of a runtime error; with dynamic typing, the editor does this for you.

If the editor can infer the type and let me know of it, why shouldn't/wouldn't the compiler do the same thing? This is one thing I've really never understood about the dynamic typing argument: it frequently seems to come down to "let's do type inference somewhere other than the compiler." The only putative value I can see to that is if what's implicitly meant is "and let's not report an error on anything that can't be inferred," which seems to me to be worse than useless: the casual programmer is lulled into a false sense of security because the system doesn't tell them they might be talking nonsense, vs. the static typing case, which is admittedly conservative because it sometimes rejects valid programs. I think this may inform a later comment of yours, so more in a moment.

Curtis W: I suppose static typing allows for solving the halting problem, as well? ;-) I would like to see examples of these; I doubt they're something that couldn't be done with a hybrid system.

Boy, you must be new here. :-)

Acute deals with type-safe marshaling, dynamic type-and-version-safe upgrade, and several other features.

HashCaml follows up on Acute, specifically in type-safe marshaling, dealing with several constructs that Acute did not.

TyPiCal is the Typed Pi Calculus system, statically preventing deadlock and race conditions and also providing information flow analysis.

Flow Caml is all about information flow analysis, adhering to a security policy, etc.

Sure, you could implement these in a hybrid system of some kind—it's just that they'd all fall on the static side of such a hybrid, with the notable exception being that there's work to be done in the information flow realm to allow policy statements to be values as well as types.

So ultimately, yeah, we all want some kind of integration of static and dynamic typing so that we can choose our acceptable risk level in our code. Partial evaluation, dependent type systems, Ontic-like semantics... I honestly don't know which of these, or whether some combination, will yield the most value, nor do I have a really good handle on how to do any of these things (yet—I suspect that finishing working through TAPL and ATTAPL will help a lot). But I do know that the big challenges in programming today revolve around concurrency, distribution, and security, and am confident that those challenges are only completely addressable with a very rich type system, probably involving dependent types or Ontic-style semantics. I also think, though, that such a system needs to be inferred and needs to be incremental, so that it can be embedded in an IDE and stay out of the way as much as humanly possible, so that programmers can reap the benefits of interactive, incremental development and not confuse "static typing" with "declare everything, batch everything" development.

That's what I think the original comment from the article was getting at, to try to bring all of this home: it's perfectly possible to have interactive, incremental development with a statically-typed language, as I do in all of my O'Caml work. Lots of folks think that's the exclusive province of the "scripting" language or dynamically-typed languages like Lisp/Scheme, but it isn't. I think if that word gets out more, we'll see more people moving away from dynamically-typed languages to friendly statically-typed languages, rather than vice-versa. Then, hopefully, something like Tim Sweeney's ideal language will appear, and much of this debate will really become moot!

Curtis W: The only circumstance I can imagine where dynamic typing with a decent development environment is that bad is when the programmers on said project aren't too knowledgeable in their profession, in which case you're already screwed if you try to get them to do anything moderately hard, regardless of the type system.

Like it or not, most programming teams in the world aren't A-list, so yes, you really do have to think about basic competency issues. "It's not that bad in the real world" is the constant canard argument that dynamic typing fans make, and it's simply false. It's based on the inaccurate premises that all programmers are good programmers and that there's no basic difference, really, between a tool that proves the absence of undesirable properties before delivery and one that doesn't, especially when those properties are among the most complex ones we deal with in software. Even if you assume every team is an A-list team, very bright, very capable, very experienced programmers screw up shared-state concurrency all the time—it's the norm, not the exception. The situation with distribution and, especially, security is even worse. Having a compiler that rejects code that will deadlock or has race conditions, or that emits code that's at least guaranteed to throw an exception if I attempt to do type-incompatible marshaling/unmarshaling, or rejects code that violates a security policy, means that I can concentrate more of my creative energies on solving the actual problem at hand, instead of worrying about things that the computer can worry about for me. That's unbelievably valuable, and again, it's pretty obviously where things are going over the long haul—yes, the features that right now are only found in exotic-seeming languages need to be mainstreamed, careful attention has to be paid to syntax and tool support and all of the other i's and t's that go into making a real industrial language—but make no mistake, this is how the tough programming issues will be solved. After all, all I'm saying, in the end, is something like "all sufficiently complex programming issues are ultimately resolved through metalinguistic abstraction," and once a few meta-solutions are proven over time, they stop being meta and simply become new languages.

Incremental/Interactive development

also think, though, that such a system needs to be inferred and needs to be incremental, so that it can be embedded in an IDE and stay out of the way as much as humanly possible, so that programmers can reap the benefits of interactive, incremental development and not confuse "static typing" with "declare everything, batch everything" development.

That's what I think the original comment from the article was getting at, to try to bring all of this home: it's perfectly possible to have interactive, incremental development with a statically-typed language, as I do in all of my O'Caml work

I know you're a big OCaml fan, so I was wondering if you had tried out F#? The F# addin for Visual Studio is very sweet. You get all the intellisense stuff, plus you can just write down any snippet of code you want in the editor, hit ALT-; for a one liner and have it compiled and evaluated, or highlight a block of code and hit ALT-Enter and have it compiled and evaluated.

Or you can just go down to the REPL window and type out some code and ;; to get it evaluated.

Wrong Platform :-)

If I were a Windows guy, I'd certainly take a look at F#. :-) I do appreciate the functionality you're describing—it's just that I appreciate it with O'Caml and Tuareg Mode in GNU EMACS.

If the editor can infer the

If the editor can infer the type and let me know of it, why shouldn't/wouldn't the compiler do the same thing?

Because the editor can notify you better. Which would you prefer, highlighted warnings with popup explanations before you compile, or a terse text log which swamps you with all the warnings at once?

The only putative value I can see to that is if what's implicitly meant is "and let's not report an error on anything that can't be inferred," which seems to me to be worse than useless: the casual programmer is lulled into a false sense of security because the system doesn't tell them they might be talking nonsense, vs. the static typing case, which is admittedly conservative because it sometimes rejects valid programs.

What, you've never heard of warnings before? It's not as though the editor is going to go through all that work to throw the information it got away. In any event, consider the differences that are left when comparing a hybrid dynamically typed system with a structurally subtyped, type inferred static system. The only difference is in the handling of possibly erroneous situations. In dynamic typing, the editor alerts you to this fact, whereas in a static system you have to notify yourself with a cast. Furthermore, this doesn't even guarantee correctness; if, at run time, the given object isn't compatible, you've got an error on your hands that might've gone unchecked.

More False Dichotomies to Slay :-)

Curtis W: Because the editor can notify you better. Which would you prefer, highlighted warnings with popup explanations before you compile, or a terse text log which swamps you with all the warnings at once?

This is exactly why I urge the development of incremental compilers: it's not a batch world anymore, and hasn't been for about a decade. Modern machines spend the overwhelming majority of their time waiting for their users—time that could be productively spent dealing with a developer's actions in real-time. There are incremental parsing algorithms and incremental type-inference algorithms for some type systems (typically those with "principal typings" vs. "principal types," surely a confusing state of affairs). So what's really needed isn't even strictly in the realm of language design, but rather in better compiler stage separation and the provision of an event-driven interface. The point, of course, is that this has nothing whatsoever to do with being either statically- or dynamically-typed. I almost get this with O'Caml and Tuareg Mode, but not quite: the "show type at point" feature of Tuareg Mode only works if the code in question has been compiled with the -dtypes option, i.e. it's still a batch process. That's something the O'Caml team could address, however, if they chose to.

Curtis W: What, you've never heard of warnings before? It's not as though the editor is going to go through all that work to throw the information it got away.

Nor would the compiler, but remember, I'm talking about that which can't be inferred in even the most powerful known type systems. After all, that's the real advantage of dynamic typing; it allows a programmer to express things that intuition says are safe (or at least the risk of runtime failure is acceptable) that all known static type systems would reject.

Curtis W: In any event, consider the differences that are left when comparing a hybrid dynamically typed system with a structurally subtyped, type inferred static system. The only difference is in the handling of possibly erroneous situations. In dynamic typing, the editor alerts you to this fact...

OK, I'll bite: how? Is the editor going to do type inference (let's keep it simple and say "in the Milner-Damas system like the ML family") and then tell you about each and every term that it can't type that way? Yikes. I have to be honest: I use such a type-inferring-but-warning compiler. It's called Steel Bank Common Lisp, and I pay almost no attention to its warnings because they are so voluminous and, um, underinformative. I worry that this would be the case for any such language, whose expressive power far outstrips the power of its type system.

Curtis W: ...whereas in a static system you have to notify yourself with a cast.

Except in actual statically typed languages where unsafe (that is, down) casts don't exist. :-)

Curtis W: Furthermore, this doesn't even guarantee correctness; if, at run time, the given object isn't compatible, you've got an error on your hands that might've gone unchecked.

Correct—that's exactly why languages such as O'Caml don't have such casts.

I'd be interested to know

I'd be interested to know what "intuitively safe" code you have that isn't typeable under, say, the Calculus of Inductive Constructions. Granted it's not particularly pragmatic as a programming language...

Of Course...

...I don't have any: I don't trust my intuition as much as I trust even Milner-Damas, which is a big reason why I now do all my recreational programming in O'Caml. :-) But the argument for dynamic typing I've heard repeatedly is that the programmer knows what he/she is doing and the type system is just "getting in the way." I've not seen any non-contrived examples—the one that generally is offered is heterogeneous lists, which first of all is a mechanism, not a use-case in a domain; and secondly is pretty easy to implement in Standard ML, O'Caml, or Haskell.

I can find plenty of

I can find plenty of examples that I can't type in a language that's production-ready, although most of the really fun ones . I picked the CIC for a reason though - if I can't type it given pretty much the entirety of constructive mathematics to work with, I'm not sure I should be trusting my intuition (if I have a firmer reason for it, that's different - systems that aren't supposed to terminate, for example).

A dynamic typing thingy

Here's a non-trivial use of dynamic typing from my work.

The platform is HTML+JavaScript. The application is a touchscreen interface for a kiosk. The interface naturally splits into things called "features" that are organized into "pages". Example feature: "screensaver kicks in after n minutes of inactivity". Another example feature: "the back button leads somewhere". Another example feature: "a list of songs is displayed on the screen". Different pages are just different combinations of features.

The common interface of all features consists of 2 methods: show and hide. Hide has no arguments. Show has one argument, the type of which varies from feature to feature. Example: the back button's argument is the name of the page the back button leads to. The screensaver's argument is the inactivity time. And so on.

All features and all pages have names. The data structure that describes the whole interface can be schematically described as: map<pageName, map<featureName, argument>>.

There might be a way to statically type this, but the DT solution works just fine :-)

Closures to the rescue

Vladimir Slepnev: map<pageName, map<featureName, argument>>

Of course, the obvious way to circumvent this (and actually the more natural solution IMHO) is to use closures and just have a map<pageName, unit->T> (where T is whatever return type the show function has). Each closure just applies the show function from the respective feature to its argument.

DRY

This violates the "don't repeat yourself" rule :-)

What's repeated?

How is that?

Aah

Sorry, I was a bit careless.

You must've meant map<pageName, set<unit->T>>. This doesn't violate DRY, but it also doesn't contain all the required information. We don't know which "hide" functions to call - only "show". And if we fix it in the obvious way, we'll start to violate DRY :-)

Mh...

You are right, I missed that it should be a set. Sorry for that.

But I still don't believe there is a problem. You need hide as well. Fine: then store a set of pairs of closures instead - in other words, a set of objects, each having a show and a hide method (both without arguments!). The show argument becomes a constructor argument for the respective feature classes (or functions creating features). This should be no more verbose to express.

One advantage of this solution is that it scales to more heterogeneous cases, where not all features by accident happen to have exactly one argument for show. ;-)

OK

Aah, I see.

bind :: Feature<T> → T → BoundFeature

map<pageName, set<BoundFeature>>

Is this what you meant? Good! (Sorry for the crazy syntax; I'm just trying to be understandable.)

As for scaling, I have no problem supplying a 2-element array as an argument to show. Dynamic typing, you know :-)

Almost what I meant

...except that I wouldn't even bother with the bind function, and instead always construct appropriate BoundFeatures right away. What would you ever need a Feature<T> value for?

To pass different arguments

You might say that a Feature<T> is a constructor :-)

Another observation

I sense an interesting trend here. Your statically-typed solution is composed from values with matching types. My dynamically-typed solution, OTOH, contains names of stuff, not stuff itself.

I like names. It seems to me that dynamically-typed systems encourage giving names to stuff. This may be one of the core reasons I like DT :-)

The Web likes names, too; and it's dynamically typed - an URL may point to anything.

Naming stuff

Unless I'm missing something obvious, the DT solution doesn't name the type, just the names of the containers. ST gives can name both the type and the containers. So if you like naming....

Hm

That's a different meaning of "naming" :-) My "naming" maps Strings to Things. Let me explain.

To my subjective eye, types in ST languages don't even have names. Let's say we want to read a class name from a config file and default-construct an object of this class (say, a user plugin to our software). It's easy in Java: we have Class.forName(String). Even easier in Ruby. Note the user didn't have to "register" stuff somewhere, or otherwise worry about naming; it already exists.

Another fun topic is RMI naming (java.rmi.Naming). I can bind any object to any name, then call it from another machine. Remote calls are one of those cases where you absolutely need dynamic typing. Remote calls can't be verified statically, period. (Compile the client with an old version of the server interface, then use it with the new version. Crash.)

I like the freedom to name and look up stuff that DT languages give me. YMMV.

Well, Alice has it's packages

that provides for remote components and calls, so I'm sure Andreas can clues us in about how to accomplishes this within the framework of static langauges (link to Andreas paper about The Missing Link - Dynamic Components for ML.

Runtime type information

It's easy in Java: we have Class.forName(String).
...
I like the freedom to name and look up stuff that DT languages give me. YMMV.

Despite the holes in its type system, Java is ST by any normal definition, so the use of Java as an example of a language in which this is easy illustrates that this is not an ST issue. C++ also supports access to runtime type information (RTTI).

The issue is that in a language that does significant static processing, there are certain kinds of names that are only needed by the language implementation itself at compile time. Type names in particular are a common example of this. The choice to keep information about types at runtime is then completely optional, and some languages choose to keep it around, whereas others don't.

ST languages often drop that information by default, on the principle that programs should only pay for the features they use. But there's nothing intrinsic about ST languages that prevents having runtime type information, and there's even nothing intrinsic about DT languages that requires having that information (other than the type tags necessary for runtime typechecking). The same goes for having runtime access to other kinds of names, such as the names of lexical variables.

Remote calls

Remote calls can't be verified statically, period. (Compile the client with an old version of the server interface, then use it with the new version. Crash.)

Depends. You can dynamically check types - once! - before you perform the first call, i.e. when you establish the connection. Then the calls themselves can be statically type-safe. Since Chris mentioned Alice ML: this is what it does with its proxy mechanism.

So yes, you need a small amount of dynamic type checking for open distributed programming, but it's much smaller than you think - and it does not preclude having a statically typed language.

Regarding names: there are symbolic names and there are logical names (ids). I suppose you mean the former. Problem with them is that they always include the potential for name clashes.

Naturally...

...I was going to point out that type-safe distribution is exactly what Acute and HashCaml do, so "Remote calls can't be verified statically, period" is rather obviously too strong!

Well...

AFAIK, Acute's inter-process communication consists only of IO.send/receive, and those actually perform a dynamic type check for every single transmission. Similarly with HashCaml. So I'm afraid they are not really proper counter examples to Vladimir's argument that type-safety cannot be ensured statically for inter-process I/O.

Senses of "Static"

It depends upon how you define your terms. If, in fact, I can't crash the runtime by using an old version of a client against a new version of a server; if, in fact, attempting to compile/link old and new versions of interfaces and their implementation fails; if, in fact, I'm guaranteed to get an exception when the distributed code mismatches, I seem to have landed at a very important sweet spot with respect to both the static case (compile/linking of incompatible services fails) and the dynamic case (the code over the network is guaranteed not to "go wrong," i.e. end up in an undefined state). That certainly addresses the "old version + new version = crash" concern, which is all one can ask for.

Linking vs IPC

I agree with everything you say, except for the last half-sentence ;-). Because dynamic linking and inter-process communication are usually different things, operating on a different level of granularity. And for the latter I indeed ask for more than per-transmission checks (which just form the distributed counterpart to a dynamically checked language).

On second thought, given that the context of your post was IPC alone and Vladimir's specific claim, I also disagree with your first sentence. :-)

What about MiniKanren in Haskell?

I'm able reproduce MiniKanren in Haskell if I restrict the values the logic system deals with to a single ADT, but this isn't very satisfactory. It's nothing like programming in MiniKanren on top of Scheme. I don't know how to produce a logic system in Haskell that feels like a typed MiniKanren.

Now, a logical system can be typechecked. I bet a not-too-terribly complicated preprocessor could feel like "MiniKanren on Haskell." However, this is all way more complicated than what Scheme offers.

Perhaps any decidable type system prohibits useful means of expression... even though those means of expressions could be typed by a different, decidable, type system. Just a possibility running through my head.

Kanren was influenced

Kanren was influenced (inspired?) by the work of Spivey and Seres on emebdding logic programming in Hasekll, so I suggest you look at this work if this direction interests you (start here and follow the links).

The point, of course, is

The point, of course, is that this has nothing whatsoever to do with being either statically- or dynamically-typed.

It most certainly does. I would go so far to say that in any structurally subtyped language it's going to be a bit more difficult to program in compared to a nominally typed language without help from the programming environment.

Nor would the compiler, but remember, I'm talking about that which can't be inferred in even the most powerful known type systems.

I'm not sure what you mean. The type of any statement can be inferred with the presence of compound types; it just means that some processing needs to be done at run time, which would have to be done manually by the programmer had the compiler not done it.

OK, I'll bite: how? Is the editor going to do type inference (let's keep it simple and say "in the Milner-Damas system like the ML family") and then tell you about each and every term that it can't type that way? Yikes.

Still not sure what you mean. The basic idea is to infer the types and check for the situations which might cause a run time error due to incompatible types. Syntax errors and certain other errors, e.g. guaranteed erroneous code, would be handled at compile time, as well.

Correct—that's exactly why languages such as O'Caml don't have such casts.

Ah, yes. I'd forgotten about forced checks. They might satisfy the compiler, but the presence of a check doesn't necessarily mean it does any good, as can be seen in Java exceptions. Because of this, it can't really be considered a benefit. Once you remove that, it decays into the implicit vs explicit casting argument, which I've already addressed.

Feels like it'll explode

Nor would the compiler, but remember, I'm talking about that which can't be inferred in even the most powerful known type systems.

I'm not sure what you mean. The type of any statement can be inferred with the presence of compound types; it just means that some processing needs to be done at run time, which would have to be done manually by the programmer had the compiler not done it.

Though the inferencer can always give you a "safe" answer, the precision of the type may be too poor to allow the kinds of static checks you'd like to perform. A statically typed language enforces a certain amount of modularity and abstraction in order to ensure that it will always be able to prove the absence of a certain set of errors.

You seem to be of the opinion that explicit casts are useless. They aren't; they provide a static checker with information about the programmer's intent. Without them, a checker would warn about *every* mismatched assignment statement. An explicit cast is the programmer saying: "I know these don't match up, but do it anyway". Of course, if you have to insert explicit casts all over the place (as with pre-generics Java), then there's something wrong with the type system.

OK, I'll bite: how? Is the editor going to do type inference (let's keep it simple and say "in the Milner-Damas system like the ML family") and then tell you about each and every term that it can't type that way? Yikes.

Still not sure what you mean. The basic idea is to infer the types and check for the situations which might cause a run time error due to incompatible types. Syntax errors and certain other errors, e.g. guaranteed erroneous code, would be handled at compile time, as well.

I don't think you can write whatever you want and expect a sound static checker verify the properties you're interested in without exploding. Change one line and the checker may suddenly be unable to make any useful assertions about a large chunk of your program. The restrictions imposed by a statically typed language are present primarily to avoid this situation.

Sure, static typing is often more difficult but I think the benefits are worth it. Do you have any particular examples of things that are unreasonably difficult in a modern static type system? What about languages that have a dynamic invocation "escape hatch", like VB.Net and Boo?

Though the inferencer can

Though the inferencer can always give you a "safe" answer, the precision of the type may be too poor to allow the kinds of static checks you'd like to perform.

The precision of the type depends on how many different types the variable can hold. I fail to see how this would not be useful when performing static checks.

A statically typed language enforces a certain amount of modularity and abstraction in order to ensure that it will always be able to prove the absence of a certain set of errors.

This has nothing to do with static/dynamic and everything to do with inferred/explicit typing.

You seem to be of the opinion that explicit casts are useless. They aren't; they provide a static checker with information about the programmer's intent.

The problem being, the compiler can usually guess my intent pretty well by how I use the variable.

Without them, a checker would warn about *every* mismatched assignment statement. An explicit cast is the programmer saying: "I know these don't match up, but do it anyway". Of course, if you have to insert explicit casts all over the place (as with pre-generics Java), then there's something wrong with the type system.

At the beginning of this paragraph you talk about how highlighting every single mismatch would simply swamp the programmer. Of course, you realize that the places where the editor would highlight mismatches are the same where explicit casts would be required otherwise, right?

I don't think you can write whatever you want and expect a sound static checker verify the properties you're interested in without exploding. Change one line and the checker may suddenly be unable to make any useful assertions about a large chunk of your program. The restrictions imposed by a statically typed language are present primarily to avoid this situation.

Ah, but you see, it's not that you can't give the compiler extra information with a dynamic language, it's that it's optional. Most of the time, you don't need to use it; if you do, however, it's there for you.

Sure, static typing is often more difficult but I think the benefits are worth it. Do you have any particular examples of things that are unreasonably difficult in a modern static type system?

Not necessarily difficult, but certainly cumbersome.

edit: terminology

A quick nitpick: inferred vs

A quick nitpick: inferred vs nominal isn't a relevant dichotomy. You might be looking at one or both of nominal vs structural (the answer to which is: both!) or inferred vs explicit/manifest.

Whoops. My mistake, I meant

Whoops. My mistake, I meant explicit :-)

At the beginning of this

A statically typed language enforces a certain amount of modularity and abstraction in order to ensure that it will always be able to prove the absence of a certain set of errors.

This has nothing to do with static/dynamic and everything to do with inferred/explicit typing.

A statically typed language is designed with the type checking algorithm in mind. Even systems with type inference don't allow you to write whatever you want because the language designers knew they couldn't write a type checker to handle it.

When you design a language without any direct consideration for static verification (which is, I believe, how dynamic languages are designed), it's probably not going to be easy to check. I'm guessing you'll end up with a time or state space explosion.

BTW, I think our definitions of static vs. dynamic may not match up and we may be talking past each other here. What do you consider to be the difference between a statically typed language with type inference and a dynamically typed language?

At the beginning of this paragraph you talk about how highlighting every single mismatch would simply swamp the programmer. Of course, you realize that the places where the editor would highlight mismatches are the same where explicit casts would be required otherwise, right?

Yes. Maybe "swamp" was not the right word :) An explicit cast is a way for the programmer to tell the compiler (and other readers of the code): "yes, these types aren't implicitly compatible, but I know what I'm doing". Without them, the editor/compiler would always show you every type mismatch. Explicit casts are a way to suppress false positives.

When you design a language

When you design a language without any direct consideration for static verification (which is, I believe, how dynamic languages are designed), it's probably not going to be easy to check.

It's not going to be easy, but that's because you're moving something that's normally done by the programmer into the compiler/editor. To be honest, I'd rather have the development environment be more complex if it allows me to write more concise, flexible code.

Yes. Maybe "swamp" was not the right word :) An explicit cast is a way for the programmer to tell the compiler (and other readers of the code): "yes, these types aren't implicitly compatible, but I know what I'm doing". Without them, the editor/compiler would always show you every type mismatch. Explicit casts are a way to suppress false positives.

The way I see it, highlighting and explicit casting have the save visual impact, the only difference being the the latter is done manually by the programmer whereas the former is done automatically by the editor.

And in having been done

And in having been done manually by the programmer, a cast shows intent - "I intend to treat this String I read in as a Name", for example.

Sort of. It does show

Sort of. It does show intent, but it doesn't add anything that couldn't be discerned by the source code or types, e.g:

type Name = [char]

name = Name(in.readline())

adds nothing over

name = in.readline()

In our ideal editor, the latter's type will obviously be determined to be [char] and if it weren't it would be easily noticeable by the programmer.

A fan of

A fan of pointless^Wpoint-free style might not bother binding the value to a variable called name though. What then?

You look at how it's used?

You look at how it's used.

Which puts you back to

Which puts you back to attempting to assign intent when you don't actually know. Next thing you know, you're forced to conclude that either you're getting it wrong or that the programmer actually intended to dereference that null...

Wait, what?

Wait, what? I'm thinking this is a somewhat bad example. From what I can tell, a Name is just a string, so I see no reason why you actually need to label it a Name. If a Name actually has some other properties, it's time to create a new time.

Even if it's just a string...

there are reasons to wrap it. Documentation first and foremost, but also future-proofing and encapsulation of published interfaces. In high-risk environments with long depreciation cycles, that sort of things is often worthwhile (particularly if a refactoring browser will do the encapsulation for you).

I had a "fun" case with page

I had a "fun" case with page names in my wiki clone. One of those lessons you hopefully only need learn once...

Future-proofing? Tell me,

Future-proofing? Tell me, how are you going to future proof a string? Wrap it in get/set methods and call it "encapsulated"? Future-proofing implies the presence of properties other than just being a string, which is an entirely different situation. If you want an example of this, consider a Name which consists of any alphanumeric string:

alphanumeric = regex("[a-zA-Z0-9]+").match
type Name = [char](alphanumeric)

In this case, it makes sense to use the type in, e.g., argument lists.

It's not an entirely

It's not an entirely different situation, because in practice there aren't any strings with no additional properties - just times where you don't care what they are.

As soon as the contents of

As soon as the contents of strings have specific meanings, it can make sense to assign more specific types than simply String - it can stop you mixing up term- and type-level identifiers in a typechecker, for example.

Not quite

The latter doesn't show intent, it shows either intent or programmer error. Could be either one, and there's no way to determine which without examining a lot of context. The (admitted) redundancy and verbosity of the first carries an implicit "Yeah, I really meant this" which the second just doesn't have. Fine if you and your team don't make many errors, but I'm just not that good.

Intent includes programmer

Intent includes programmer error.

I'm confused...

Did you just define intent to include carelessness, ignorance, and confusion (the root causes of most programmer error)? If so, you've probably stretched the word far beyond what most of the participants here are used to.

Did you just define intent

Did you just define intent to include carelessness, ignorance, and confusion (the root causes of most programmer error)?

That's what I said, yes. Consider a casting example, which supposedly only shows "intent":

char* zero = (char*)0;
puts(zero);

This obviously shows that the programmer is fully aware that he's casting an int to a char--intent, but it also shows the misconception that said operation will return a string "0". Obviously this is a blatent error, an error, unknowingly or not, that shows intent.

And it shows intent

And it shows intent regarding char * - the problem is that char * may or may not refer to an array and thus a string. The problem is that the type system has left the declaration of intent ambiguous.

Programmer error?

Well, breaking it down into a simplistic two camps framework, we have:

  • What the customer thinks he needs
  • What the programmer thinks the customer needs
  • What the programmer thinks the program does
Any one of these three things can be in error. The customer doesn't really comprehend what they need. The programmer has very little inkling of what the customer wants. And the programmer doesn't fully understand how the software works. But they are three different classes of errors. I find it silly to say that because the first two are very hard to get right, that we shouldn't worry about the third class of errors. If the programmer can't get the program right, it's pretty irrelevent whether the customer and programmers are in complete agreement about the task.

So, yes, the programmer may have an error in undertstanding the requirements. But once the programmer decides upon a course of action, then that programmer dang well ought to know how the software meets their conception of the requirement. Adding an extra layer of uncertainty doesn't make things more flexible.

As a programmer, I generally classify bugs as those things that a program does that I can not understand or explain - i.e. the program acts unexpectedly. Of course, the customer may consider things they do not expect as bugs, but that is a different story.

[Edit Note: This response is probably only loosely related to the current thread. Just been meaning to express my opinion that the belief that because customer specs are so fluid, that program verification is not helpful].

Point of Divergence

Curtis W: It's not going to be easy, but that's because you're moving something that's normally done by the programmer into the compiler/editor. To be honest, I'd rather have the development environment be more complex if it allows me to write more concise, flexible code.

This is where I think we're talking past each other: dynamic languages aren't "moving something that's normally done by the programmer into the compiler/editor." Quite the opposite! Typically (and I really should write "without exception that I'm aware of," because it's true of at least Scheme, Common Lisp, Perl, Python, Ruby, and TCL), dynamically-typed languages make it extremely difficult to do type inference even in a relatively simple type system such as Milner-Damas, let alone a richer type system such as, e.g. GHC 6.4's with GADTs and so forth. The type checking generally takes place in the programmer's head and with what checking you get at runtime, and I don't know of any runtime checking in any of those languages that's getting anywhere near what, e.g. GADTs do.

So the assertion that you can have a dynamic language with an external editor that somehow "shows you where you'd need casts" is a nice idea in theory, but the only practical example of it I'm aware of is DrScheme's MrFlow, which seems to be perpetually "coming soon," although once we had MrSpidey. It would be nice to compare MrFlow's inferred types with the types inferrable by GHC 6.4 or later, to try to get to a reasonable apples-to-apples comparison.

But clearly, we agree on the desirability of sophisticated environments that "allow us to write more concise, flexible code." It just might be surprising how close you can get to that in a statically-typed language. Consider the venerable subject-observer pattern. Here it is, in its entirety, in O'Caml:

class ['O] subject =
object (self : 'selftype)
  val mutable observers : 'O list = []
  method add_observer obs = observers <- obs::observers 
  method notify (message : 'O -> 'selftype -> unit) =
    List.iter (fun obs -> message obs self) observers
end

class ['S] observer =
object
end

Even if you don't know O'Caml, you can probably intuit what's going on: there's a "subject" class that's parameterized by the observer type. It maintains a list of observers. It has a method to add an observer to the list, and a method to notify all observers of something. That "something" is a function taking an observer and a subject as parameters. Then there's an observer class, which is parameterized by the subject type. By default, it does nothing.

Here's an example of its use:

class ['O] window =
object (self)
inherit ['O] subject
  val mutable position = 0
  method move d = position <- position + d; self#notify (fun x -> x#moved)
  method draw = Printf.printf "Position = %d\n" position
end

class ['S] manager =
object
inherit ['S] observer
  method moved (s : 'S) = s#draw
end

So now we have a window class and a manager class. Whenever a window is moved, it notifies its observers by calling their "moved" method. The manager's "moved" method calls the "draw" method on the window. Let's test it:

# let w = new window in w#add_observer(new manager); w#move 5;;
Position = 5
- : unit = ()

Cool, but it's not apparent at this point why this is cooler than the pattern in Scheme or Smalltalk or whatever. So let's mess around a bit.

class ['O] no_draw =
object (self)
inherit ['O] subject
  val mutable position = 0
  method move d = position <- position + d; self#notify (fun x -> x#moved)
end

Now let's try to use it:

# let w = new no_draw in w#add_observer(new manager); w#move 5;;
This expression has type
  (< draw : 'b; .. > as 'a) manager = < moved : 'a -> 'b >
but is here used with type < moved : 'c no_draw -> unit; .. > as 'c
Type 'a is not compatible with type
  'c no_draw =
    < add_observer : 'c -> unit; move : int -> unit;
      notify : ('c -> 'c no_draw -> unit) -> unit >
Only the first object type has a method draw

In my terminal window, the expression "new manager" is underlined, so it's clear that it's what's being referred to by "This expression." As you can see, the compiler gave us a very detailed error, and was kind enough to sum it up in the last line: "Only the first object type has a method draw." Similarly:

# let w = new window in w#add_observer(new observer); w#move 5;;
This expression has type 'a observer but is here used with type
  < moved : 'b window -> unit; .. > as 'b
Only the second object type has a method moved

We can't just stick a plain ol' observer on a window, because the plain observer doesn't have a "moved" method.

So with 10 lines of extremely simple code we have the subject-observer pattern, with the benefit traditionally ascribed to a dynamic implementation (you can put any "type" of observer on the subject that you want as long as the observer knows how to respond to the notification; the subject and observer can evolve completely independently of each other) but with the added benefit that if there is a mismatch between the methods supported by the subject and observer, the code will not compile, vs. experiencing a "method not found" exception at runtime.

This is where I think we're

This is where I think we're talking past each other: dynamic languages aren't "moving something that's normally done by the programmer into the compiler/editor." Quite the opposite! Typically (and I really should write "without exception that I'm aware of," because it's true of at least Scheme, Common Lisp, Perl, Python, Ruby, and TCL), dynamically-typed languages make it extremely difficult to do type inference even in a relatively simple type system such as Milner-Damas, let alone a richer type system such as, e.g. GHC 6.4's with GADTs and so forth.

If you give me an example I will gladly show you how it would be done in a hybrid system.

class ['O] subject =
object (self : 'selftype)
  val mutable observers : 'O list = []
  method add_observer obs = observers  'selftype -> unit) =
    List.iter (fun obs -> message obs self) observers
end

class ['S] observer =
object
end

type Observer(method) = any(x => hasattr(x, method))
def notify(objects:[Observer(method)], method, *args):
    for object in objects:
        getattr(object, method)(*args)

Mine's shorter ;-) Oh, and because I know you're thinking it: it's completely possible to check this at compile time. The compiler can and will perform the type checks (hasattr(x, method)) at compile time in the same situations as yours. The difference is, of course, that if mine can't determine the type at compile time, it assumes the programmer knows what he's doing and just inserts a check in case he doesn't. Yours, on the other hand, assumes the programmer is wrong and requires that he insert the proper checks, whether they're actually helpful or not depends on how responsible the programmer is, just like in my system. Now, I don't know about you, but I'd think having a programming language that assumes I know what I'm doing is much less annoying than one that assumes I'm always wrong. It's also worth noting: I'm advocating designing for responsible, knowledgeable people, not people that are perfect. Mistakes are inevitable and I fully recognize the fact that this needs to be addressed when designing a language, but I'm definitely against assuming everything is a mistake when it's not 100% certain that it is.

It's Hard to Compete...

...with code written in a language that doesn't exist, so...

Curtis W: Oh, and because I know you're thinking it: it's completely possible to check this at compile time. The compiler can and will perform the type checks (hasattr(x, method)) at compile time in the same situations as yours. The difference is, of course, that if mine can't determine the type at compile time, it assumes the programmer knows what he's doing and just inserts a check in case he doesn't.

...shall we say, assumes facts not in evidence. :-)

Curtis W: Yours, on the other hand, assumes the programmer is wrong and requires that he insert the proper checks, whether they're actually helpful or not depends on how responsible the programmer is, just like in my system.

Huh? The O'Caml "subject" and "observer" classes literally assume nothing (type them into the O'Caml toploop and you'll find that the inferred types contain no constraints), and the examples only contain code that's actually necessary to address the domain (type them into the O'Caml toploop and you'll find that the inferred types only constrain the objects to have the methods used in the code, and no more). There are no casts, no coercions, and certainly no checks, proper or otherwise, inserted. All that happens is that the compiler prevents me—absolutely—from mixing subjects and observers that can't actually communicate with each other.

Curtis W: Now, I don't know about you, but I'd think having a programming language that assumes I know what I'm doing is much less annoying than one that assumes I'm always wrong.

I personally prefer "trust, but verify." The O'Caml subject/observer implementation (which, BTW, isn't mine; it's straight out of "On the (un)reality of Virtual Types") is very nice: as far as the "subject" class is concerned, the "message" can be any function that takes an observer and a subject as parameters—and it keeps working no matter whether you subclass "subject" or "observer" or not, in parallel or not, as long as, when you compile the add_observer call, each class has the method(s) that are called by the other class. End of story. This is vastly preferable to potentially delivering code to the customer that throws a "method not understood" exception!

Curtis W: Mistakes are inevitable and I fully recognize the fact that this needs to be addressed when designing a language, but I'm definitely against assuming everything is a mistake when it's not 100% certain that it is.

Me too, which is why I prefer languages that, to the greatest extent possible, check for actual mistakes, vs. either of the undesirable cases of:

  1. Assuming that what you want to do is a mistake (in fairness, most popular statically-typed languages suffer from this to an appreciable degree)
  2. Letting you actually make a mistake and making your users pay for it

Huh? The O'Caml "subject"

Huh? The O'Caml "subject" and "observer" classes literally assume nothing (type them into the O'Caml toploop and you'll find that the inferred types contain no constraints), and the examples only contain code that's actually necessary to address the domain (type them into the O'Caml toploop and you'll find that the inferred types only constrain the objects to have the methods used in the code, and no more). There are no casts, no coercions, and certainly no checks, proper or otherwise, inserted. All that happens is that the compiler prevents me—absolutely—from mixing subjects and observers that can't actually communicate with each other.

I'm talking about the situations in which it's not possible to determine at compile time whether a given variable fits the requirements, e.g:

def create_compatible():
    event = ( => out.print("compatible"))
    return None::[event=event]

def create_incompatible():
    return None

type_map = {0:create_compatible, 1:create_incompatible}
observer = type_map[in.readline().to_int()]()
notify(observer, "event")

In that example, there are several potential points of failure that aren't handled--the most obvious being that the observer may or may not have the correct event. The other, less obvious points of failure are that the user might not input an integer or it might not be in the correct range. If you were to write this in a statically typed, non-casting programming language, you'd end up with something like this:

def create_compatible():
    event = ( => out.print("compatible"))
    return None::[event=event]

def create_incompatible():
    return None

type_map = {0:create_compatible, 1:create_incompatible}
input = in.readline()
if input.is_int():
    number = input.to_int()
    if number in [0..2]:
        observer = type_map[number]()
        if hasattr(observer, "event"):
            notify(observer, "event")
        else:
            raise "Observer doesn't have an 'event' method"
    else:
        raise "Number not in range [0..2]"
else:
    raise "Input is not integer"

Does this look like any other languages that you know that force you to do things? I don't know about you, but this is eerily close to Java and exceptions. The only way to avoid this fate in a statically typed language is to use casts, which are no better then just using dynamic typing.

Outside World

Let's stipulate, please, that the point at which a language has to communicate with the world outside the runtime is the point at which you have to have scaffolding to ensure that the interaction will be successful or somehow signal failure. Again, FFIs, marshaling/unmarshaling, and some categories of I/O all obviously exist at this intersection. These are interesting (as my posts elsewhere about Acute, Saffire, HashCaml, etc. attest), but they have nothing to do with the 90%+ of any given program of any size that aren't about FFIs, marshaling, or I/O.

Also, please do me a favor: if you're going to write code to demonstrate counterexamples, please use a real language: it's spectacularly unlikely that your intuition as to what can be done in a language is sufficiently accurate for claims about your hypothetical examples to be correct unless you've implemented half a dozen languages or so—at the very least, you should have worked through "Lisp In Small Pieces" and "Types and Programming Languages" if you haven't yet implemented any languages yourself. You might also wish to learn O'Caml, at least, since your earlier comments about the subject/observer pattern that I posted were incorrect.

Let's stipulate, please,

Let's stipulate, please, that the point at which a language has to communicate with the world outside the runtime is the point at which you have to have scaffolding to ensure that the interaction will be successful or somehow signal failure. Again, FFIs, marshaling/unmarshaling, and some categories of I/O all obviously exist at this intersection. These are interesting (as my posts elsewhere about Acute, Saffire, HashCaml, etc. attest), but they have nothing to do with the 90%+ of any given program of any size that aren't about FFIs, marshaling, or I/O.

That's true, but it doesn't negate my approach in the slightest.

Also, please do me a favor: if you're going to write code to demonstrate counterexamples, please use a real language: it's spectacularly unlikely that your intuition as to what can be done in a language is sufficiently accurate for claims about your hypothetical examples to be correct

Actually, you're right. I suppose O'Caml doesn't force you to provide an alternative, but it still requires you to use some sort of casting or similar method.

No, On Two Counts

Curtis W: That's true, but it doesn't negate my approach in the slightest.

True: both dynamically-typed and "hybrid," which I take to mean "dependently-typed," languages have interesting stories to tell around the issues of safe I/O.

Curtis W: Actually, you're right. I suppose O'Caml doesn't force you to provide an alternative, but it still requires you to use some sort of casting or similar method.

I'm unsure of the context here. If you mean doing I/O in, e.g. O'Caml and ensuring that the values brought into the runtime are of the types ascribed to them at the time of input, then you're almost right: you have to construct values of a type defined to the compiler from some other "dynamic" type (a string to parse, a byte-array to munge, etc.) For example, here's a function to read a 32-bit integer in little-endian format from an input stream in O'Caml:

(** This function reads an int32 that is encoded in little-endian order from the passed in_channel.
    @author Paul Snively
    @param in_chan in_channel
    @return int32 *)
let input_int32_le in_chan =
  let aux le =
    if String.length le <> 4 then 
      raise (Invalid_argument "int32_from_little_endian");
    let d3 = of_int (Char.code le.[0])
    and d2 = of_int (Char.code le.[1]) 
    and d1 = of_int (Char.code le.[2]) 
    and d0 = of_int (Char.code le.[3]) in
      (logor (shift_left d0 24) 
	 (logor (shift_left d1 16) 
            (logor (shift_left d2 8) d3))) in
  let buf = String.create 4 in
    really_input in_chan buf 0 4;
    aux buf

There's one visible failure case (aux explicitly checks the string length and raises an exception if it's wrong) and at least a couple of invisible ones (String.create can fail, really_input can fail in a couple of ways). Of course, what I imagine you're really referring to is the fact that, given an arbitrary input stream at an arbitrary position, there's no way to ensure that the next four bytes are, in fact, intended to be a 32-bit integer in little-endian order. That's the part that we appear to agree to stipulate—in this case, you can see the liberal application of type constructors taking some other type. That's not the same as casting—these operations are guaranteed to be safe—but there's a clear demarcation between the world of bytes and the world of int32s here.

If, however, that isn't what you're referring to, then I'm at something of a loss: you may be continuing to claim that, e.g. the subject/observer pattern in O'Caml requires casts or conversions, which has already been demonstrated to be false. That strikes me as unlikely to be the case, though, as you don't seem willfully obstinate. :-)

I'm unsure of the context

I'm unsure of the context here. If you mean doing I/O in, e.g. O'Caml and ensuring that the values brought into the runtime are of the types ascribed to them at the time of input

Yeah, pretty much.

Also, please do me a favor:

Also, please do me a favor: if you're going to write code to demonstrate counterexamples, please use a real language: it's spectacularly unlikely that your intuition as to what can be done in a language is sufficiently accurate for claims about your hypothetical examples to be correct unless you've implemented half a dozen languages or so—at the very least, you should have worked through "Lisp In Small Pieces" and "Types and Programming Languages" if you haven't yet implemented any languages yourself. You might also wish to learn O'Caml, at least, since your earlier comments about the subject/observer pattern that I posted were incorrect.

I'm confused. Are you saying that the code examples in my language are likely impossible, or that I don't know what's possible in other languages?

Broadly...

...both. I'm concerned because you've claimed that there's nothing about Python that would make type inference challenging, and you've totally misunderstood even the very simple O'Caml code I've posted. So when you post code in a language that doesn't even exist, all you're proving is that you can conceive of a syntax that you like. That doesn't mean that you can provide the syntax the semantics that you're claiming for it, any more than "Can God make a stone so big that even He can't move it?" tells us anything about large stones or God.

and you've totally

and you've totally misunderstood even the very simple O'Caml code I've posted.

What have I misunderstood?

So when you post code in a language that doesn't even exist, all you're proving is that you can conceive of a syntax that you like. That doesn't mean that you can provide the syntax the semantics that you're claiming for it, any more than "Can God make a stone so big that even He can't move it?" tells us anything about large stones or God.

This isn't some language I've conceived right here and now just for this discussion--I've been working on it and thinking about it for the past year. I'm fairly certain I know what's possible and what's not because I've already thought it through. If you don't believe it's possible, I'll gladly consider your argument, but I don't tend to be wrong on these types of things.

You might like to post

You might like to post semantics for the language somewhere then. When people post code in existing languages, we already know exactly what it means.

The semantics are in my

The semantics are in my description of any code posted and in the comparison between my code and his.

She means formal semantics.

She means formal semantics.

For certain values of formal

For certain values of formal - definition by interpreter works for me.

Driving Towards Specifics

Curtis W: What have I misunderstood?

A fair question. Your initial reaction to the subject/observer pattern was that it somehow "assumed the programmer was wrong" and "required checks to be inserted by the programmer." Neither of these observations were accurate. You then changed the discussion to type-safety around I/O, which we eventually stipulated is an issue, but it's still totally distinct from whether non-I/O code imposes any restrictions/constraints on the programmer.

Curtis W: This isn't some language I've conceived right here and now just for this discussion--I've been working on it and thinking about it for the past year. I'm fairly certain I know what's possible and what's not because I've already thought it through.

My point is precisely that absent extensive experience and/or formal training in designing programming languages, it's unbelievably unlikely that your language will actually prove implementable. Particularly when you talk about "hybrid static/dynamic systems," this is an area of open research into partial evaluation, dependent types, etc. People get PhD's in this domain, today. "Thinking it through" over the past year or so, particularly if you've never actually delivered a language before (I'm assuming, since you haven't commented on it) means quite literally nothing.

Curtis W: If you don't believe it's possible, I'll gladly consider your argument, but I don't tend to be wrong on these types of things.

Everyone tends to be wrong on these things. It's the study of type theory and programming language design, coupled with experience, that tends to reduce the error rate. Like Phillipa, I'll withhold further judgment until I see some kind of formal semantics.

A fair question. Your

A fair question. Your initial reaction to the subject/observer pattern was that it somehow "assumed the programmer was wrong" and "required checks to be inserted by the programmer." Neither of these observations were accurate.

That was in reference to static typing in general, not just your code. The latter is my fault; I was basically trying to say that static typing requires some sort of casting mechanism to be done by the programmer. If you kindly point to something you've said that I haven't refuted, I'll gladly do it now.

which we eventually stipulated is an issue, but it's still totally distinct from whether non-I/O code imposes any restrictions/constraints on the programmer.

Interaction with the outside world is exactly where static and dynamic typing differ and it's what I've been either talking about or referring to the entire time, although I've just realized that now.

Everyone tends to be wrong on these things. It's the study of type theory and programming language design, coupled with experience, that tends to reduce the error rate. Like Phillipa, I'll withhold further judgment until I see some kind of formal semantics.

What is it you need to know? I figured most of the code samples are pretty self-explanatory, especially since the type system isn't all that different from O'Caml's except for the dynamic typing and dependant types.

the type system isn't all

the type system isn't all that different from O'Caml's except for the dynamic typing and dependant types.

Dynamic dependent types must be about as different from OCaml's type system as you can get!


What they want is a full description of how your language works, preferably as formal semantics made up of rules but at least an accurate, unambiguous description of the meaning of your language's code.

Dynamic dependent types

Dynamic dependent types must be about as different from OCaml's type system as you can get!

Really, I was referring to the ability to have union types and type constructors, which aren't present in most languages. I'd say my type system is about as far away from python as it is from O'Caml.

What they want is a full description of how your language works, preferably as formal semantics made up of rules but at least an accurate, unambiguous description of the meaning of your language's code.

I know, I just don't think it's necessary for such a narrow topic as a type system.

It is, though. All too often

It is, though. All too often "just extend this type system with that feature" turns into a complete quagmire - you'd be amazed how much difficulty's been had with subtyping, for example.

I guess this is the closest

I guess this is the closest thing to an up-to-date spec. I do have a slightly old ebnf grammar for it, but I haven't touched it in at least a month or two.

Syntax != semantics

Syntax is not the same thing as semantics. A grammar tells us what a program should look like (syntax), but nothing about what the effect of executing that program will be (semantics). As Philippa has already mentioned, even an interpreter would give us some idea of what the semantics of the language are. Have you written an interpreter for this language?

The code example actual

The code example actual contains rather detailed explanations of what's going on. I could possibly get the grammar up to snuff. Maybe I should relabel that page as 'General Overview.'

OT: Large stones...

Just an aside, the stones question is just a way of asking whether something can be true and false at the same time...

To bring the question into a PL context, a better way to phrase the age old question:

Can Oleg find a single PL that combines all his insights (currently scattered amongst Scheme, ML and Haskell)?

By "easy", I meant "doable" :)

It's not going to be easy, but that's because you're moving something that's normally done by the programmer into the compiler/editor. To be honest, I'd rather have the development environment be more complex if it allows me to write more concise, flexible code.

Everyone here is probably in favor of a language that lets you write more concise and flexible code. But by "not easy", I meant "really, really hard". Hard to the point of being impractical.

The way I see it, highlighting and explicit casting have the same visual impact, the only difference being the the latter is done manually by the programmer whereas the former is done automatically by the editor.

I agree that they have the same visual impact, but as many others have said, an explict cast lets the programmer give the static checker some additional information.

Let's say you had an editor that highlighted type mismatches. In a language with explicit casts, the editor will only highlight a subset of the mismatches -- the subset that hasn't been explicitly OK'd by the programmer. In a language with no mechanism for the programmer to indicate "this mismatch is OK", the editor will highlight every single type mismatch; it has no way of knowing which ones you've already checked and approved of. I think you'd end up completely ignoring the editor warnings.

Throughout this discussion, I've been thinking of explicit casts in the context of unsafe downcasting in object oriented languages. For the example you brought up, it might be reasonable for there to be implicit conversions between the two types. For example, the 'type' keyword in Haskell actually allows the following:

type Name = [Char]
x :: Name
x = "abc"

Also, C# has a mechanism for the programmer to define implicit conversions.

Everyone here is probably

Everyone here is probably in favor of a language that lets you write more concise and flexible code. But by "not easy", I meant "really, really hard". Hard to the point of being impractical.

What makes you say that?

Let's say you had an editor that highlighted type mismatches. In a language with explicit casts, the editor will only highlight a subset of the mismatches -- the subset that hasn't been explicitly OK'd by the programmer. In a language with no mechanism for the programmer to indicate "this mismatch is OK", the editor will highlight every single type mismatch; it has no way of knowing which ones you've already checked and approved of. I think you'd end up completely ignoring the editor warnings.

Highlighting would only occur when casting would be required in other languages. For that statement to hold true, you'd end up with so many casts that you'd end up ignoring them in other languages. Maybe I haven't been clear enough on how it works--with a dynamically typed language, checking for type mismatches means checking for instances in which a value might not be valid for a given variable. This means that the example you posted above would not be highlighted, either.

Details

Maybe I haven't been clear enough on how it works

Yes, this is exactly why putting a more detailed writeup on your own site is a good idea.

Aye. Unfortunately, I'm more

Aye. Unfortunately, I'm more of a coder than a writer, so I guess that's to be expected.

Interpreter

Thats ok. If you can code up a small interpreter, then we can run your examples and see exactly what your talking.

Actually, I already have an

Actually, I already have an almost-complete interpreter, but I don't think it would be very helpful in explaining how type inference is used, considering my interpreter doesn't use type inference.

Going to Church

It would be good enough to show an interpreter of the explicitly typed language after type inferrence (i.e. in Church style). Of course, it would be more convincing if your examples had enough type annotations to pass your type checker so we won't have to guess how the inference works.

Of course, it would be more

Of course, it would be more convincing if your examples had enough type annotations to pass your type checker so we won't have to guess how the inference works.

Most of what I posted looks pretty obvious--do you have any specific examples in mind?

Beyond the primitives...

...what i'd prefer to see is something that doesn't revolve around pure primitives (ints, floats, strings, etc) and something along the lines of data abstractions (classes, modules, etc...). My thinking is that you quickly move away from the primitive types and into the user defined types rapidly.

I may not be the most reliable source for suggestions having spent way too much effort on this particular toy example, but perhaps a Shape example where you have a collection (list, array, etc...) of shapes (rectangle, circle, etc) that gets a type error when you attempt to add an object that is not derived from the shape class. So from a Python perspective, this should throw an error:

   scribble : Shape list = [Rectangle(10, 20, 5, 6), Circle(15, 25, 8), "NotAShape"]

From there, you could move out to more complex forms of type derivation and inference.

Well, your shape example

Well, your shape example would look something like this:

type Point = [number]::[move]
def point_move(point:Point, distance:[number](x => len(x)==len(point)):
    point[:] = zip(point, distance).map(([x,y] => x+y))

def create_point(point:Point):
    return point::[move=point_move]

data Rectangle:
    pos:Point
    width:number
    height:number

def rect_draw(rect):
    out.print(rect)

def create_rectangle(pos, width, height):
    return Rectangle(create_point(pos), width, height)::[
        draw = rect_draw
    ]

...

def draw_shapes(shapes):
    for shape in shapes:
        shape.draw()

draw_shapes([
    create_rectangle([0, 0], 10, 10)
])

draw_shapes([None]) # Error

I'm not really sure what this has to do with the discussion, though.

edit: Forgot to show use. Basically what happens is that the compiler can infer that shapes in draw_shapes has to be an iterable object which iterates over objects with a draw attribute. Since None doesn't have said attribute, it gets flagged at compile time.

Iterable object?

The discussion was about type inference and type checking. You say that we can infer that the type of the shapes collection has to implement an iterable interface, but that begs the question of what is the type. What if I have an unrelated class that also happens to implement the iterable interface. Well, I suppose we can then go out and find out that this item not only has to implement the iterable interface, we can also somehow infer that it needs a draw() method (or else it would get a doesNotImplement - or whatever that Smalltalk terminology is). So we've determined these two things, but we can still get an iterable class that has a draw method that is not a dervied from the shape class.

Of course, the Dynamic crowd will tell us that if it implements these methods then it effectively meets the type criteria (duck typing). But what would you call the type of shapes? And how do you go about restricting what is an allowable element within that collection?

Anyhow, this was not intended to be an exercise where you display untested code. This was just a simple starting example that might give a better reference point in determining how you infer the types of a collection (with the added bonus of having to contend with subtyping in the inference process), and how you would annotate and enforce the type for that collection when the user wants to limit it further than just a drawable and iterable item (maybe the programmer only wants rectangles).

The discussion was about

The discussion was about type inference and type checking. You say that we can infer that the type of the shapes collection has to implement an iterable interface, but that begs the question of what is the type.

The inferred type is [any::[draw]], which says, in english, "an object which can iterate over any objects with a draw method."

What if I have an unrelated class that also happens to implement the iterable interface.

That's fine, so long as it iterates over objects of type draw. This means that, for example, that the objects might not even be on the same machine that the program is running on--it could be retrieving them from a server.

So we've determined these two things, but we can still get an iterable class that has a draw method that is not a dervied from the shape class.

Ah, but you see, having a 'draw' method is being derived from the Shape class. If you actually wanted to define a Shape it would look something like this:

type Shape = any::[draw]

Which means exactly what it looks like--any object with a 'draw' method. There is no inheritance in my language; if you want an interface, use something like the above code, if you want reuse, just tack the functions you want to reuse onto the new class.

Of course, the Dynamic crowd will tell us that if it implements these methods then it effectively meets the type criteria (duck typing).

That's because it does meet the criteria.

And how do you go about restricting what is an allowable element within that collection?

If you want to artificially restrict it, just assign the variable a type.

(with the added bonus of having to contend with subtyping in the inference process)

Structural subtyping is the basis with which inference operates in my language. I hardly see how this is "contending."

Have you dealt with variance

Have you dealt with variance issues yet? Can you correctly type mutable and immutable lists parameterised on the item type?

I've no idea what variance

I've no idea what variance issues are :) I'm also not completely sure what you mean by "correctly type...lists parameterised on the item type." If you mean can I represent a list type in the type system, then yes:

type sequence(element_type) = any::[
    get:function(any)->element_type,
    set:function(any, element_type)->None
]

I'm still working on the syntax for function types since I haven't really thought about them much.