Lambda the Ultimate

The Case for First Class Messages
started 5/14/2004; 6:02:30 AM - last post 6/17/2004; 12:15:41 AM
Chris Rathman - The Case for First Class Messages
5/14/2004; 6:02:30 AM (reads: 13717, responses: 168)
The Case for First Class Messages
Touches upon a number of language design issues, though I think the case it tries to make for Message Oriented Programming needs work.

One of the major challenges in the design and implementation of a programming language is the consistent definition and treatment of the major concepts in the language. First class abstractions in a language are usually considered the mark of clean language design. One can often notice the treatment of concepts as second class when there are lots of restrictions on the definition of sub concepts or their use. A language concept is called "first class" when it can be used freely in the programs in all contexts in which this would be reasonable.

Functional languages, for example, are defined by their support for first class functions. Thus in functional languages it is possible to write functions that create other functions; functions can be assigned to variables; functions can be passed as parameters to other functions and finally functions can return functions as results.
Posted to OOP by Chris Rathman on 5/14/04; 6:03:24 AM

Frank Atanassow - Re: The Case for First Class Messages
5/14/2004; 6:57:00 AM (reads: 2955, responses: 0)
First class abstractions in a language are usually considered the mark of clean language design.

I don't really buy this. Although all the major language features (tuples/records, choices/variants and functions) are instances of second-class things promoted to first class, sometimes the value of such promotions is questionable, or even makes the language worse.

For an example of the former consider continuations. Making continuations first class is useful but it also forces a language to adopt call-by-value or call-by-name semantics, so it's not a 100% win, and it's not a conservative extension. Linear (one-shot) continuations may be a different matter; I don't know enough about them.

For an example of the latter consider first-class types (meta-classes) as in Java. Part of the point of these is that they support introspection, which is ostensibly useful. But they also break all the encapsulation guarantees which the rest of the language purports to provide.

I think one mark of clean language design is an elegant (usually this means nicely factored/orthogonal) and strong meta-theory, that is, simple and powerful reasoning principles for programs. Sometimes making things first class can improve the meta-theory, but sometimes it can make it weaker and more complicated.

(I haven't read the article yet, so I'm only taking exception to that one sentence.)

Patrick Logan - Re: The Case for First Class Messages
5/14/2004; 10:08:43 AM (reads: 2951, responses: 0)
But they also break all the encapsulation guarantees which the rest of the language purports to provide.

Some thoughts as I shoot from the hip (as usual)...

  • Encapsulation is not all it's cracked up to be.
  • Java's Class objects are hardly "first class" (even in the loose definition of that term).

one mark of clean language design is an elegant (usually this means nicely factored/orthogonal) and strong meta-theory, that is, simple and powerful reasoning principles for programs.

I have nothing against strong theory, I am all for it. However, theory typically follows practice, whether it is continuations, "objects" (note the quotes), reflection, or generally "side effects".

Specifically, Smalltalk and CLOS have demonstrated a successful practice of "first class" (again, note the quotes) messages. Maybe a theory should or could put this into a clean, elegant context, as existing theories have been doing for the other effects listed above. Meanwhile, the practice is still useful, perhaps "elegant", if ad hoc.

I'm not going to say therefore that "first class" anything implies a "clean" design. Yet a combination of simple ingredients, one of them being "first classness" for most things, appears to have led to *utility* in several similar ad hoc languages, e.g. Smalltalk, Lisp, Python, Ruby.

nickmain - Re: The Case for First Class Messages
5/14/2004; 10:18:20 AM (reads: 2900, responses: 2)
> But they also break all the encapsulation guarantees which the rest of the language purports to provide.

This is not necessarily true. Java introspection does not allow you to access or invoke non-public members.

It is also very clear when code is using introspection, since it involves calls to methods of classes in the java.lang.reflect package. It doesn't change the semantics at the language "syntax" level (iykwim).

On the other hand, the Class-loader capabilities, coupled with a bytecode manipulation library, lets you subvert the semantics at the lowest level...

Daniel Yokomizo - Re: The Case for First Class Messages
5/14/2004; 10:35:28 AM (reads: 2909, responses: 1)
This is not necessarily true. Java introspection does not allow you to access or invoke non-public members.

This piece of code runs without throwing exceptions:

import java.lang.reflect.Field;

public class Foo {
    private String bar;

    public Foo(String bar) {
        this.bar = bar;
    }

    public String toString() {
        return this.bar;
    }

    public static void main(String[] args) throws Exception {
        Foo foo = new Foo("Hello World!");
        System.out.println(foo);
        // Reflectively obtain the private field...
        Field barField = foo.getClass().getDeclaredField("bar");
        // ...and disable the access check that would normally forbid touching it.
        barField.setAccessible(true);
        barField.set(foo, "Goodbye World!");
        System.out.println(foo);
    }
}

Luke Gorrie - Re: The Case for First Class Messages
5/14/2004; 1:23:27 PM (reads: 2842, responses: 1)
Patrick writes: Encapsulation is not all it's cracked up to be.

Even supposing it is, it would be nuts to ignore the nifty things you can do with encapsulation-breaking introspection features.

An obvious example is writing a debugger. A common "encapsulation respecting" way is to write it as a hairy under-the-hood runtime-system extension. A more fun way is to expose some encapsulation-breaking primitives for inspecting the stack and pulling apart objects, then just write the debugger as a regular program. Profilers are another example.

Lisp hackers write some really nice development tools using tricks like inspecting the stack, iterating through the whole heap, dismantling arbitrary objects, temporarily replacing function definitions, and so on. To ignore all of this on software-engineering principle would be to miss the party!
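[In Java terms, the stack-inspection half of this can be sketched from plain code with `Thread.getStackTrace` — a rough illustration only; real debuggers use far richer hooks than this.]

```java
import java.util.ArrayList;
import java.util.List;

public class StackPeek {
    // Collect the method names on the current call stack: the running
    // stack is visible to ordinary code, no runtime-system hacking needed.
    static List<String> frames() {
        List<String> names = new ArrayList<>();
        for (StackTraceElement f : Thread.currentThread().getStackTrace()) {
            names.add(f.getMethodName());
        }
        return names;
    }

    static List<String> outer() { return frames(); }

    public static void main(String[] args) {
        // Shows the call chain main -> outer -> frames among the frames.
        System.out.println(outer());
    }
}
```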

Matthew - Re: The Case for First Class Messages
5/14/2004; 1:52:38 PM (reads: 2823, responses: 1)
Can someone define 'First-class' for me, in this context? I seem to see it everywhere, and have a vague idea of what it means (constructs supported on a language level/as built-in types), but as somewhat of a self-taught computer scientist have never seen the definition, and suspect there's more to the term than just that...

Also what is Second Class? supported only through libraries?

Daniel Yokomizo - Re: The Case for First Class Messages
5/14/2004; 2:39:02 PM (reads: 2820, responses: 0)
First class: you can pass it as a parameter, return it from a function, and bind a name to it (e.g. store it in a variable).

Second class: you can't do some or all of the above. For example, in classic Pascal we can't return a function from another function, but we can pass a function as a parameter to another function.
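[In modern Java, say, all three tests pass for functions — a small sketch using java.util.function, which of course postdates this thread:]

```java
import java.util.function.Function;

public class FirstClass {
    // Bound to a name (stored in a variable).
    static Function<Integer, Integer> twice = x -> x * 2;

    // Taken as a parameter by another function.
    static int applyTo3(Function<Integer, Integer> f) {
        return f.apply(3);
    }

    // Returned as a result from another function.
    static Function<Integer, Integer> adder(int n) {
        return x -> x + n;
    }

    public static void main(String[] args) {
        System.out.println(applyTo3(twice));    // 6
        System.out.println(adder(10).apply(3)); // 13
    }
}
```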

Bryn Keller - Re: The Case for First Class Messages
5/14/2004; 4:40:31 PM (reads: 2801, responses: 0)
Patrick: Encapsulation is not all it's cracked up to be.

Luke: Even supposing it is, it would be nuts to ignore the nifty things you can do with encapsulation-breaking introspection features.

I've written lots of Python code that loads functions dynamically and stuffs them into objects as singleton methods, or builds classes at runtime or walks the stack looking for interesting information, and it's true that I do miss that flexibility when I'm working in statically typed languages. Some folks say the answer is an optional type system, where you can supply types and have the compiler check them, or omit them and then the runtime system will have to do the checking. But I don't like this idea much. As LTU readers probably know, I like static type checking a lot, and I'm not convinced making it optional helps anybody.

However, I sometimes feel that systems that emphasize static typing are more static than they need to be. Basically, I'd like to keep the safety of static typing, but without giving up dynamic program behavior. I'd like to be able to load plugins, say, or update running code, in my Haskell program. I don't need ultimate dynamicity in every aspect of my program, but I would like to be able to make graceful transitions from one known (statically checked) program state to another.

So for example, I run the compiler and it checks my types, generates an executable. I start the program, it runs for a while, and at some point it dynamically loads or reloads some code. The type checker automatically runs again. If it doesn't like the new code, it rolls the program back to its old state. If the new code is acceptable, then the modified program continues on its merry way, and I get to know that the modified program has every bit as much type safety as the original. Then I could have my eval, and eat it too.

I suppose someone will tell me that there's a version of the Haskell web server that supports plugins, or that the Ocaml or SML toplevel does things like this, but I don't think they make all the safety guarantees I'd like when loading new code, and it doesn't seem like it's going "with the grain" of the language to use these things for production systems anyway. Maybe I'm wrong.

Maybe something like MetaOcaml could help with this problem someday... Or there may already be a solution somewhere I'm just not aware of. Anyone?
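[A very weak approximation of the "checked transition" in Java, for concreteness: the hypothetical Plugin interface is invented for the sketch, the "rollback" is just a rejected load, and only a subtype relation is checked — nowhere near the full static guarantee asked for above.]

```java
public class CheckedLoad {
    // Hypothetical contract the host program compiles against.
    interface Plugin { String greet(); }

    static class Hello implements Plugin {
        public String greet() { return "hello"; }
    }

    // Load a class by name and admit it only if it provably implements
    // Plugin; otherwise reject the load (the "roll back to old state").
    static Plugin load(String className) {
        try {
            Class<? extends Plugin> c =
                Class.forName(className).asSubclass(Plugin.class);
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            return null; // rejected: keep running the old code
        }
    }

    public static void main(String[] args) {
        Plugin p = load("CheckedLoad$Hello");
        System.out.println(p == null ? "rejected" : p.greet());
        // A class that is not a Plugin is refused at load time.
        System.out.println(load("java.lang.String") == null ? "rejected" : "accepted");
    }
}
```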

Toby Reyelts - Re: The Case for First Class Messages
5/14/2004; 5:11:33 PM (reads: 2789, responses: 0)
This piece of code runs without throwing exceptions:

That's because you're running under a SecurityManager that allows you that access. If you run that same program under a tighter SecurityManager (say the one installed by your Applet class loader), you'll get an exception.

This means that, yes, you can break encapsulation, but only when it's safe - i.e. explicitly allowed.

Mark Evans - Re: The Case for First Class Messages
5/14/2004; 5:13:08 PM (reads: 2777, responses: 0)

For reference, from a previous LtU post on Sina:

The behaviour of an object [in Sina] can be modified and enhanced through the manipulation of incoming and outgoing of messages.

Mark Evans - Re: The Case for First Class Messages
5/14/2004; 8:19:10 PM (reads: 2753, responses: 0)

Bryn: Or there may already be a solution somewhere I'm just not aware of. Anyone?

You'll find what you need in Alice ML with its limited dynamic typing extensions to ML.

Chris Double - Re: The Case for First Class Messages
5/15/2004; 7:04:17 PM (reads: 2653, responses: 0)
Also the functional language Clean (http://www.cs.kun.nl/~clean/) has a Dynamic Typing extension.

Frank Atanassow - Why Introspection Breaks Encapsulation
5/16/2004; 7:04:59 AM (reads: 2630, responses: 0)
First, let me address this comment and then I'll treat the rest.

Toby: This means that, yes, you can break encapsulation, but only when it's safe - i.e. explicitly allowed.

My definition of `encapsulation' is more rigorous, and no system of introspection types can respect it. (Actually, instanceof doesn't respect it either, but the problem is not as severe.)

Here is the problem. Suppose I have types A and B, with B <: A. In Java they would be interfaces. The meaning of subtyping is merely that the language permits a B anywhere an A is required, but the way subtyping is used in practice requires roughly that B behaves like A (the LSP, Liskov Substitution Principle) in such a way that we can regard B as an implementation of A. The point of this, of course, is that I can switch out different implementations of A by supplying different objects which belong to a subtype of it; that is, the point of this is encapsulation.

In order for this to work, it has to be the case that an A-context (any code which uses an A) cannot depend on the particular subtype of A which is supplied to it, because then if I replaced the implementation with another, it would break. (Well, not quite.) But this is precisely what introspection allows. Notice (nickmain) that even being able to determine the existence of a public method is problematic.

(instanceof is not as bad because you can only really use it when you can enumerate the subtypes of a type, that is, when there is a closed-world assumption in effect. In this case, you can write code that handles all the cases. But, as everyone knows, instanceof sucks and should be avoided regardless.)
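[A small Java illustration of the hazard, with invented types: the A-context below uses introspection to distinguish two behaviorally identical implementations, so substituting one implementation of A for the other silently changes the result.]

```java
public class LeakyContext {
    interface A { int value(); }
    static class B1 implements A { public int value() { return 1; } }
    static class B2 implements A { public int value() { return 1; } }

    // An A-context that introspects its argument. B1 and B2 satisfy
    // exactly the same A-behavior, yet this code tells them apart,
    // so it depends on the particular subtype it is supplied.
    static int use(A a) {
        if (a.getClass().getSimpleName().equals("B1")) {
            return a.value() + 100; // "special case" for one implementation
        }
        return a.value();
    }

    public static void main(String[] args) {
        System.out.println(use(new B1())); // 101
        System.out.println(use(new B2())); // 1
    }
}
```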

Now here is, I think, the real reason that first-class types are dangerous: the `best' notion of equivalence for values is (beta-eta-)equality (in contrast to syntactic equality), while the `best' notion of equivalence for types is one step weaker, namely isomorphism. The reason is that any type B isomorphic to a type A can serve just as well in any A-context, provided you insert a coercion (which in OO languages is effectively determined dynamically by dispatch), and vice versa. This is analogous to the fact that, at the value level, it does not affect the behavior of a program whether I write 2+2 or 4, since they are equal. (Note, BTW, that LISP-style macros can distinguish these two, which is one reason I object to them. :)

Well, maybe it's confusing for you to hear about `equality' and `isomorphy' if you are not used to equational languages. The asymmetric analogues to this in non-equational languages (such as all OO ones I know of) are `convertibility' (the reduction relation) and `behavioral subtyping'. For values, `x' behaves like `y' if `x' converts (reduces) to `y'. Similarly, a type B behaves like A if B is a behavioral subtype of A, which I'll write B << A.

So, suppose now that I have an operator type, which, given a type A, produces a value type(A) that represents the type A as a value. In order to respect encapsulation in the presence of first-class, then, one needs to ensure at least that, for all A, B:

B << A     implies     type(B) => type(A)    (*)

where << means `is a behavioral subtype of' and => means `reduces to'.

Of course, Java doesn't respect this; indeed, it can't respect this since it doesn't even respect the LSP, since it can't decide the behavioral subtyping property! And if Java did implement this rule then Luke's example of a (conventional) debugger would not be implementable using introspection, since you wouldn't be able to access the methods or fields which are hidden by upcasting, that is, the ones which are only accessible if you know the `dynamic' type of an object, that is, the ones belonging to the implementation.

Now, you might argue, `well, if we use subtyping as a weak replacement for behavioral subtyping, putting the burden on the users to respect the LSP, we might as well add the above as a reduction rule to Java.' But in my opinion this is a bridge too far. It would be OK if people actually wrote programs in such a way that they didn't depend on subtyping modeling behavioral subtyping, but they don't, and I'm sure OO people would find that unacceptable.

You might also argue, `well, if we use subtyping as a weak replacement for behavioral subtyping, putting the burden on the users to respect the LSP, we might as well put the burden on the users to respect (*) as well.' But writing programs which respect the LSP without the compiler double-checking it for you is hard enough; writing introspective programs which respect the LSP and (*) is even harder...

Incidentally, a calculus I'm developing does enforce/respect behavioral subtyping, for, admittedly, a very weak notion of `behavior'; furthermore, it does not identify inheritance with subtyping, or force types to have a single supertype. Adding first-class types there might be interesting, and I spent this morning thinking about how to do it, and what uses there might be for it; mainly I'm thinking of extensional polymorphism like in G'Caml, lightweight generics and dependent sums (dynamics). I can imagine a natural categorical semantics for it... so perhaps first-class types are interesting after all.

Frank Atanassow - Re: The Case for First Class Messages
5/16/2004; 7:06:08 AM (reads: 2622, responses: 0)
Patrick: Encapsulation is not all it's cracked up to be.

If you have to throw out such a basic and uncontroversial principle as encapsulation to advocate such an experimental feature as first-class types, I would say your argument is very suspect.

I also wonder if you realize what a double-edged blade it is. Anyone who is willing to promote introspection because they feel encapsulation is not-so-important must also be prepared to accept criticism of nearly every other language feature, such as higher-order functions and, perhaps most notably in your case, OOP, since the worth of these things is largely that they support encapsulation.

In brief, I think you're throwing the baby out with the bathwater.

Java's Class objects are hardly "first class" (even in the loose definition of that term).

My definition of `first-class' (which I assumed was shared by everyone here) is `can be passed to and/or returned from a procedure'. Surely that admits Class objects.

theory typically follows practice

More like `theory that follows practice typically follows practice'. There is a huge world of theory out there which, I suspect, you are unaware of.

Specifically Smalltalk and CLOS have demostrated a successful practice of "first class" (again, note the quotes) messages. Maybe a theory should or could put this into a clean, elegant context as existing theories have been doing for other effects listed above.

Bit of a tangent...

I can identify two schools of theoretical PLT, which I'll call the `archeological' and `engineering' schools.

The archeologists are interested in picking through existing and older languages, and finding nice ways to organize and explain what is found there. This type of research typically involves finding a semantics for a language, or figuring out if a type system is sound. Occasionally this produces immediately useful results (like fixing the bugs in Java's class-loading system, or the problems with variance in Eiffel's type system), but in my view the main worth of language archeology is that it identifies problems and issues that can only be adequately addressed in future, yet-to-be-designed languages.

The `engineering' school comprises people who try to use theory (which may have originated from archeologists) to design new features and languages. This is the crowd that invented things like algebraic datatypes, type classes, implicit parameters, monad syntax and so on.

I am in the subset of this crowd who believes good languages don't arise by accident, but rather are designed to be that way. This is an idea which maybe goes back to Dijkstra and the structured programming school. Dijkstra believed that good programs are structured, and that it's impracticable to try to extract latent structure, that, instead, it ought to be explicit. Thus they introduced structured programming constructs like while to replace goto's, making the structure explicit rather than implicit. (To my mind, static typing is just an extension of this idea, c.f., `latent' typing. This is also why I'm not very interested in using external theorem provers to improve program reliability.)

In a way, I have to admit I think of archeology as pessimistic and engineering as optimistic, because one rarely gets an archeological result like, "hey, you know what, this feature was actually really great!' :) An exception might be Algol 60, which people like Reynolds (in `The Essence of Algol') and Hoare laud for its elegance. Hoare said, Algol 60 is

a language so far ahead of its time that it was not only an improvement on its predecessors but also on nearly all its successors.

By which he means, I think, most notably Algol 68 and the C family.

Yet a combination of simple ingredients, one of them being "first classness" for most things, appears to have led to *utility* in several similar ad hoc languages, e.g. Smalltalk, Lisp, Python, Ruby.

I would think you would know me better by now than to try to convince me of the worth of feature by pointing to this family of languages. :)

Of course it is very easy in untyped programming languages to make everything in sight first-class, because untypedness (= single-typedness) smooshes everything together into one big ball of mud, which the poor programmer is left to pick apart at run-time. The problem is, what is the best way to pick it apart?

And that is where untyped languages always punt. They might get the introduction rules right, but never the elimination rules. Let me suggest that that's why so many arguably useful ideas and constructs like referential transparency, pattern-matching, folds, monads, even OOP originate in typed languages.

It reminds me of the XML community. They think that by tagging everything in sight now, they are `adding value' that sometime in the future they'll be able to exploit. But that is always infinitely postponed; the only tags that actually get exploited are the ones that are actually assigned an unambiguous semantics now, that is, the ones that have elimination rules.

I think the untyped community is caught in a cycle of infinite regress. Your remark on encapsulation exemplifies this. `Encapsulation: we want it, we don't want it, we want it, we don't want it, ...'

nickmain: This is not necessarily true. Java introspection does not allow you to access or invoke non-public members.

This is irrelevant. See my post above.

Luke: Even supposing it [encapsulation is all it's cracked up to be], it would be nuts to ignore the nifty things you can do with encapsulation-breaking introspection features.

An obvious example is writing a debugger... Profilers are another example.

It is telling that you use these two particular examples, because they are prototypical examples of metaprograms, and have a very narrow domain of application. The vast majority of applications benefit from encapsulation and, in your words, it would be nuts to ignore the nifty things you can do with it! (Even a debugger and profiler only break encapsulation in a few select places.)

But, you're right that there is a time and a place for breaking encapsulation and, indeed, every method of encapsulation needs such a time and place, because for every interface/type/module/whatever you need to supply an implementation somehow.

What I object to, rather, is that the language provides no method of enforcing encapsulation when you want it, that is, Java introspection is half-baked. It is possible, I think, to support both up-to-equality and up-to-iso/coercion typing. The first would allow Java-like introspection and break encapsulation; the second would use the law I suggested above and respect encapsulation.

(What I object to even more strenuously is that language designers who adopt features like introspection do not make clear that those features break encapsulation. And what is the upshot? Things like Joel Spolsky's leaky abstractions article!)

BTW, I made an error in my first post, when I said that introspection breaks `all the encapsulation guarantees which the rest of the language purports to provide.' Here I fell into the trap of thinking that Java subtyping is actually behavioral, so that Java does actually manage to provide encapsulation via subtyping. But this is false, so actually introspection can't break what isn't there in the first place. Consequently, it's not the case that supporting introspection in Java breaks any guarantees that people who don't use it might otherwise have.

Luke Gorrie - Re: The Case for First Class Messages
5/16/2004; 4:08:30 PM (reads: 2572, responses: 0)
Frank, I'm glad you think my examples of profiler and debugger were appropriately chosen. Certainly not all programs need to do things like these do, but I can list plenty more useful ones if you'd like.

There seems to be an important distinction between "supporting" and "enforcing" encapsulation. I'm sure I've seen this discussed before, but I don't recall where. Like most people in this enlightened age I use a lot of encapsulation. However, I don't feel threatened by the possibility of myself and the friends I program with being able to deliberately break through this encapsulation if we want to.

Personally I think that encapsulation and abstraction are great things. I want my programming language to make them easy to respect, but why is it important to make it difficult or impossible to bypass deliberately?

Patrick Logan - Re: The Case for First Class Messages
5/16/2004; 8:49:32 PM (reads: 2558, responses: 0)
There is a huge world of theory out there which, I suspect, you are unaware of.

No doubt.

It is telling that you use these two particular examples [debuggers and profilers], because they are prototypical examples of metaprograms, and have a very narrow domain of application.

Each one of these examples may be considered a narrow domain of application. But add them up and the result is fairly broad, well beyond the typical development-tools domain.

The `engineering' school comprises people who try to use theory (which may have originated from archeologists) to design new features and languages. This is the crowd that invented things like algebraic datatypes, type classes, implicit parameters, monad syntax and so on.

I am glad this school of thought exists. I believe this approach is being motivated by the desire to ground practices from more ad hoc systems in more formal notations, without losing the expressiveness of the ad hoc notations. That is noble, and when I find the results as satisfying as, say, Smalltalk, I intend to use them, because then they will certainly be better.

Frank Atanassow - Re: The Case for First Class Messages
5/17/2004; 9:55:50 AM (reads: 2477, responses: 1)
Luke: Personally I think that encapsulation and abstraction are great things. I want my programming language to make them easy to respect, but why is it important to make it difficult or impossible to bypass deliberately?

For reasons I'm sure you well know: an encapsulated/abstract data structure has a different semantics from a plain one. If T implements S, then by definition T may satisfy properties in addition to the ones S satisfies. A client of T is allowed to depend on those additional properties, but a client of S is not. So roughly the set of clients of S is a subset of the clients of T.

Let me anticipate your arguments. This is not a question of flexibility, because `you can write more programs using T than S'. And it's not a question of trying to force people using your code to program defensively or reliably or whatever. And it's not a question of imposing your own opinions or values on your user community.

It's just a question of semantics. When, in your documentation or interface, say, you write `S' instead of `T' you are exactly saying to all the people who are going to use your code that `you can rely on this certain set of properties, but no more; if you rely on more, your program may not function the way you expect.' Indeed, T may not even `exist' at that point, or if it does perhaps you are planning to change its behavior in the future, or you may not have made a decision yet as to how to implement S.

Suppose you tell your friend "Hey, I'll see you next week," and then next Tuesday you meet him and he comes up to you and says, "Hey man, you promised to see me on Monday!" Of course you would reply, "No, I didn't; I said I'd see you this week, and I have. Are you deaf or just stupid?" And if he then says, "I know you said `next week' but when you say `next week' you always mean `Monday'," naturally you would say, "Motherfucker, if I had meant Monday I would have said `Monday', not `next week'." And if he said, "Well, why didn't you say Tuesday?", you would say, "Why the hell do I have to explain myself to you? Maybe I hadn't decided yet, or I thought I might have other commitments on Tuesday but ended up being free, or I like to lead a happy-go-lucky existence and see you at my whim whenever the hell I goddamn please (within this week, at least)."

Luke Gorrie - Re: The Case for First Class Messages
5/17/2004; 9:58:59 AM (reads: 2475, responses: 0)
Thanks for clearing that up. :-)

Frank Atanassow - Re: The Case for First Class Messages
5/17/2004; 10:41:26 AM (reads: 2480, responses: 0)
Luke: Frank, I'm glad you think my examples of profiler and debugger were appropriately chosen. Certainly not all programs need to do things like these do, but I can list plenty more useful ones if you'd like.

and

Patrick: Each one of these examples may be considered a narrow domain of application. But you add them up and result is fairly broad, well beyond the typical development tools domain.

I'm sure you can give other examples, and I suspect that all or most of them are metaprograms of some sort. Let me explain by analogy why I don't consider them so important.

I'm sure you have used the C preprocessor and other preprocessing programs, and macro systems. And I'm sure you're aware of the problems with them, for example, the difficulty of tracing errors back to their source, even when you have things like #line to help, or the difficulty that occurs when you try to use multiple preprocessors (or macros), which don't know about each other, together.

One role for which the C preprocessor was indispensable was duplicating function and class definitions when they differed only in the types involved. In C++, the template mechanism helps to obviate that use. In ML, polymorphism plays a similar role. Besides being neat-o keen, these features improve on the preprocessor in that the compiler can produce good error messages. (Well, at least an ML compiler can. :)

The reason this is possible when they are language features, but not via a preprocessor, is that really preprocessed programs form a language of their own, and that a preprocessor needs to be a compiler in itself. That is, it needs to know more than simply the lexical syntax of C, it needs to know about the C grammar, and even the C semantics, to do its job in a transparent fashion. In other words, it's kind of a hack. LISP macros are better because they know not only about LISP lexical syntax, but also its grammar; but they don't know about LISP semantics, so they are still troublesome. Scheme hygienic macros are even better because they know the lexical syntax, the grammar and some of the semantics, namely variable binding. But still they can be troublesome.

To solve all the problems, you really need to modify the compiler itself, and not just preprocess the input. Obviously, that's a lot of work. You can, of course, ignore the problems; but then the user has to know about how the preprocessor works in order to make sense of the problems he encounters. The whole point of a preprocessor is to hide the intermediate stage, but that is what the user has to look at to understand any problems.

The crucial thing about ML-style polymorphism, though, is not that the compiler knows about it. The crucial thing is that it can be specified in a way that does not depend on lexical syntax or grammar, so the compiler doesn't need to expose any of its internals; it can translate any errors it encounters back into a language understandable by the user. Also, you can understand that style of polymorphism in any language, even though it may have a completely different syntax from ML (like, say, Haskell). So ML-style polymorphism solves one of the same problems as the C preprocessor, but does it in a syntax-independent fashion. (Of course, it has to have some syntax for the user to be able to use it, but the point is that the particular syntax doesn't matter.)
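[The same point can be made in Java generics, whose parametric polymorphism is in this respect ML-like: one definition replaces the family of per-type copies a preprocessor would have stamped out. A minimal sketch:]

```java
import java.util.List;

public class Poly {
    // One parametrically polymorphic definition, checked once by the
    // compiler; a preprocessor-based approach would generate and
    // separately debug a copy per element type.
    static <T> T firstOrDefault(List<T> xs, T dflt) {
        return xs.isEmpty() ? dflt : xs.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstOrDefault(List.of(1, 2, 3), 0));       // 1
        System.out.println(firstOrDefault(List.<String>of(), "none")); // none
    }
}
```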

Now replace `syntax' with `implementation' and `semantics' with `specification' (or `interface' or `contract').

What I want to suggest is that, just as a language feature like polymorphism can subsume at least one role played by a preprocessor, so also can the roles of metaprograms like debuggers and profilers be played in some respects by programs (not metaprograms) or modules or whatever in a sufficiently expressive language. Now, of course, without introspection there would be no way to get at the name, say, of a method or variable or whatever. But the purpose of a debugger, ultimately, is not to print out the names of variables in the source; it's to help debug the source, right? Printing names is only a means to an end.

Well, I cannot say what such a program would look like (though, look at Hood), except that it would be a sort of abstract debugger. And I'm not saying that every useful metaprogram can be completely subsumed by a program. But I am saying that, just as ML polymorphism is more useful than doing the same thing with a preprocessor, it is usually better to write programs than metaprograms, when possible, and that the better and more powerful a language is, the less the need for metaprogramming.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/18/2004; 3:34:01 AM (reads: 2383, responses: 0)
Message Oriented Programming is an important aspect of the next generation of advanced object oriented computer languages, as is "Meta Oriented Programming".

Smalltalk "messages" are first class objects. However, like stack frames, it's more efficient not to reify most messages as objects, since doing so slows down the virtual machine's message processing performance dramatically. To compensate for this vm implementation choice, Smalltalk enables you to access messages as first class objects when needed. You can also create messages as first class objects just as you would create any other object.

In Smalltalk you'll find messages as objects if you halt a process and debug it. This is how the debugger works with messages. In addition it's easy to create message object instances and then send them.
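For readers without a Smalltalk image at hand, the idea of a reified message — a selector plus arguments, built as an ordinary object and sent later — can be sketched in Python (the names `Message` and `send_to` are invented for illustration; this is not Smalltalk's actual API):

```python
class Message:
    """A reified message: a selector plus arguments, held as a value."""

    def __init__(self, selector, *args):
        self.selector = selector
        self.args = args

    def send_to(self, receiver):
        # Late-bound dispatch: look the selector up on the receiver
        # only at the moment of sending, as a Smalltalk VM would.
        return getattr(receiver, self.selector)(*self.args)


msg = Message("upper")        # build the message as an object...
print(msg.send_to("hello"))   # ...send it later: HELLO
```

Because the message is an ordinary object, it can be stored, inspected, or forwarded before it is ever sent — which is exactly what a debugger does with a halted send.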

Most of the meta data in Smalltalk consists of first class objects: classes, block contexts (closures), methods, messages, etc. All of the meta data that is first class is used and accessed via the same message sending syntax with which you access any other object. This important aspect of the "elegant design" of "first class objects" enables "meta data" manipulation just the same as for any other object in the system. This is one of the goals of Smalltalk: using the simple message sending syntax for all meta data. In this way all meta data in Smalltalk is expressed in the same syntax as regular non-meta data.

Now that processors have billions of cycles available, it might be wise to revisit this vm design choice to see whether there are other capabilities that would be more valuable than the raw performance of message sends. As we are learning, execution time isn't the only performance measure that matters. Human effort, comprehension, development time and costs are becoming the most relevant performance measures that developers face in the real world.

What is lacking in the current elegant Smalltalk syntax is a concise way to express "message meta data" on the fly, in message expressions themselves, rather than having to step out of "context" and use the cumbersome and verbose approach of constructing messages from objects. In addition it's advantageous to be able to do this while retaining the use of Smalltalk message sending syntax for this "message meta data".

I've achieved this "inline message meta enhancement" in defining Zoku, a variant of the Smalltalk language that also owes much to LISP, Self, and a host of other languages. As work progresses more details will be released.

The key point is that access to the underlying message sending system and to any message object in the system can be achieved using either existing Smalltalk syntax or the simple Zoku extensions. This creates a complete language that enables both objects and messages to be expressed fully on a first class and equal basis (inline in the grammar) while using and keeping the beauty and elegance of the original Smalltalk grammar.

We feel that this is a step into the future even though we've not yet fully explored all of the implications that this new "meta message" syntax makes possible. A door to new possibilities is opened. The key now is unlocking and realizing the potential.

Please see upcoming articles on dev.Smalltalk.org for more information on "Message, Object and Meta Oriented Programming and Zoku".

Patrick Logan - Re: The Case for First Class Messages  blueArrow
5/18/2004; 7:20:50 AM (reads: 2374, responses: 0)
I'm not saying that every useful metaprogram can be completely subsumed by a program. But I am saying that, just as ML polymorphism is more useful than doing the same thing with a preprocessor, it is usually better to write programs than metaprograms, when possible, and that the better and more powerful a language is, the less the need for metaprogramming.

I am not sure where the line is with Smalltalk. What is programming vs. metaprogramming in Smalltalk? I don't know. It's Smalltalk all the way down for Smalltalk-80 based systems, until you get to a fairly thin layer of magic that varies by implementation.

Regardless, I think the quote above illustrates my point that the useful theory work I see today in PL design seems to be significantly motivated by the more ad hoc practices of so called dynamic languages. Modern functional languages are becoming more expressive, approaching the better less formal languages, yet they are bringing the formality with them, which is a good thing. But as stated above, they're not all the way there yet.

Chris Rathman - Re: The Case for First Class Messages  blueArrow
5/18/2004; 11:57:21 AM (reads: 2339, responses: 0)
But as stated above, they're not all the way there yet.

In certain senses, we are already there. We can statically express functions quite well. Being a pragmatist, I can't help but think that incrementally approaching typing is the way to go. What I'd like is a programming language that allows me to statically type units but allows those units to communicate across that boundary using messages (synchronous and asynchronous) with much looser coupling.

Kind of like the way Objective-C wraps C in a Smalltalk message framework, I'd like to be able to wrap ML typed functions in objects. (O'Caml probably does this already.)

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/18/2004; 11:59:40 AM (reads: 2334, responses: 0)
Patrick Logan wrote: "I am not sure where the line is with Smalltalk. What is programming vs. metaprogramming in Smalltalk? I don't know. It's Smalltalk all the way down for Smalltalk-80 based systems, until you get to a fairly thin layer of magic that varies by implementation."

If you are used to special and distinct "meta" grammar and syntax in the languages you are using, I can see why you'd be confused about where the line is. The line is very clear in Smalltalk.

I address Patrick's question in depth here.

Avi Bryant - Re: The Case for First Class Messages  blueArrow
5/18/2004; 10:33:22 PM (reads: 2318, responses: 2)
Peter, Patrick's question doesn't indicate a lack of knowledge about Smalltalk, it's just that most of us don't see Smalltalk the same way you do. I'm with Patrick: the line is *not* clear. In fact, I tend to think of there not being a metalevel at all, but simply a deeper user-level model that encompasses a greater range of things than most languages (including compilation, method tables, and stack frames but not including garbage collection, object headers or method dispatch), and then a hard barrier with the (virtual) machine. It's rather like assembler in this way: there's just memory, registers and instructions (over which you have total control), and a machine underneath (over which you have no control). What's the metalevel for machine language?

As Patrick mentions, that's only necessarily true for Smalltalk-80 derivations, and not for other languages (like #Smalltalk) that follow the ANSI Smalltalk standard but have very different implementations. But to me, Smalltalk-80 is the point; the language as a specification is wholly uninteresting.

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/19/2004; 6:52:46 AM (reads: 2294, responses: 1)
In fact, I tend to think of there not being a metalevel at all, but simply a deeper user-level model that encompasses a greater range of things than most languages

i've been wondering exactly what Meta Programming means recently - there seem to be two different approaches (at least).

one is typically macro based - you process source "before compilation". the other is typically connected with late binding in bootstrapped languages - you rebind the functions associated with evaluation and so modify the "run-time" behaviour.

in some sense they're equivalent - it's just a case of where you do the indirection (isn't it always? ;o) - but the implementation is quite different.

does that make sense? am i stating the obvious? :o)

(i'm working on a tiny language that has almost nothing except the ability to rewrite lists. but since it can be bootstrapped and names rebound it's surprisingly flexible. i can add crude support for time-slice multithreading in 60 lines of code!)
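The second, late-binding flavour of metaprogramming described above can be sketched in Python: rebinding a name changes the "run-time" behaviour of every caller without rewriting any source text, which is what distinguishes it from the macro approach (the `trace` helper here is an invented example):

```python
# Late-binding metaprogramming: behaviour changes by rebinding the name
# the running program looks up at call time. No source is rewritten,
# unlike the macro/preprocessor approach, which transforms text
# "before compilation".

calls = []

def trace(fn):
    def wrapped(*args):
        calls.append((fn.__name__, args))   # record each call
        return fn(*args)
    return wrapped

def square(x):
    return x * x

square = trace(square)   # rebind the name; callers are none the wiser
print(square(3))         # 9
print(calls)             # [('square', (3,))]
```

The indirection is the same as in the macro case — only its location differs: here the lookup happens at every call, rather than once at expansion time.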

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/19/2004; 7:07:04 AM (reads: 2300, responses: 0)
also, in late-binding/bootstrapped meta programming (which is what i think smalltalk is, although i've never used it), is there any problem with russell's paradox (since you're using the same language to modify the evaluation of that language)? does dynamic scope take care of this (giving you contexts that impose some kind of ordering of languages)? in my little language i'm quite careful about separating different levels (like tarski), but it's not completely clear how necessary it is (i hate to think what the types would be if there were no quoting, but since it's "dynamically typed" i don't care). yes, i need to read about staged languages or whatever they're called - but i guess i'm asking if that approach is needed if you don't have a type system?

[edit - russell's paradox was a bad choice because it's going to send people off on set theory, which is a bit of a red herring. "this sentence is false" is a better example.]

Frank Atanassow - A Double Standard  blueArrow
5/19/2004; 8:41:24 AM (reads: 2280, responses: 2)
Patrick: Regardless, I think the quote above illustrates my point that the useful theory work I see today in PL design seems to be significantly motivated by the more ad hoc practices of so called dynamic languages. Modern functional languages are becoming more expressive, approaching the better less formal languages, yet they are bringing the formality with them, which is a good thing. But as stated above, they're not all the way there yet.

What a double standard!

As I've shown before, and have repeatedly pointed out, by defining a universal type, you can do anything in a statically typed language that you can do in an untyped language, including run-time reflection, dynamic loading, introspection, OO-style objects and dispatch, etc.

What you cannot do, except in certain experimental languages like Meta(ML/Ocaml), is get the same static guarantees for those features that you get for the remainder of the language. But this is no worse than in an untyped language, since there one gets no static guarantees at all!

When we say, for example, that "Haskell does not support eval but Scheme does," we are comparing apples and oranges. What we're really saying is, "Both Haskell and Scheme support untyped eval but neither support typed eval." Clearly, when you look at it this way, Haskell is much closer to supporting typed eval than Scheme, since at least it supports static typing for the part of the language which does not involve eval, whereas Scheme does not. (And languages like LISP only support monomorphic type annotations at best.)

Now, of course, mixing typed and untyped code in Haskell is not as easy as we would like it to be. But it is possible (as I also showed). But mixing typed and untyped code in Scheme is impossible, since Scheme has no notion of static typing in the first place!

People talk about delaying things until run-time as if it were an advantage. But it is clearly always easier to do things at run-time than to do the same things statically. It actually lowers the bar. And anything done dynamically in an untyped functional language can be done dynamically in a typed functional language, because of Turing-completeness.

So statically typed functional languages long ago transcended their untyped functional counterparts. It is untyped languages that are playing the game of catchup!
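One way to picture the universal type Frank describes is as a tagged value whose tag is checked at run-time exactly where typing has been deferred. Transliterated into Python for concreteness (the names `Univ` and `apply_univ` are invented; in ML this would be a sum type with constructors like `Int`, `Str`, and `Fun`):

```python
# A universal type as tagged values. In a typed language this would be
#   type univ = Int of int | Str of string | Fun of (univ -> univ)
# and everything outside the tag dispatch would still be statically
# checked; the run-time check below is the price paid only at the
# untyped boundary.

class Univ:
    def __init__(self, tag, value):
        self.tag = tag
        self.value = value

def apply_univ(f, x):
    """Untyped application, with the run-time check made explicit."""
    if f.tag != "Fun":
        raise TypeError("applied a non-function")
    return f.value(x)

inc = Univ("Fun", lambda u: Univ("Int", u.value + 1))
print(apply_univ(inc, Univ("Int", 41)).value)   # 42
```

In an untyped language every value is implicitly of this one universal type, so every operation pays the tag check; in a typed language only the embedded untyped fragment does.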

Patrick Logan - Re: The Case for First Class Messages  blueArrow
5/19/2004; 9:30:05 AM (reads: 2269, responses: 0)
As I've shown before, and have repeatedly pointed out, by defining a universal type, you can do anything in a statically typed language that you can do in an untyped language, including run-time reflection, dynamic loading, introspection, OO-style objects and dispatch, etc.

What a punt!

8^)

Manuel Simoni - Re: The Case for First Class Messages  blueArrow
5/19/2004; 10:18:33 AM (reads: 2271, responses: 1)
Frank: As I've shown before, and have repeatedly pointed out, by defining a universal type, you can do anything in a statically typed language that you can do in an untyped language

In your example you defined the universal type like type t = String of string | Int of int | Bool of bool. Is there a way to get rid of the tags, so that one could write just "foo" and not String "foo"?

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
5/19/2004; 11:23:27 AM (reads: 2245, responses: 0)
No, because "foo" is of type String while String "foo" should be of type Univ (or whatever). It's only a syntactic issue, though; if Haskell supported some way to redefine the lexical syntax, you could define new syntax for Univ literals.

Probably you would want to quote them somehow to reuse the syntax for typed literals, so, '"foo" means String "foo" and '3 means Int 3. If you are willing to write univ rather than ', you could define a type class:

class Universal t where univ :: t -> Univ
instance Universal String where univ = String
instance Universal Int where univ = Int
instance Universal (Univ -> Univ) where univ = Fun
instance Universal Univ where univ = Quote

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/19/2004; 12:05:36 PM (reads: 2212, responses: 1)
Avi, I'm not aware of Patrick's knowledge or lack thereof regarding Smalltalk. I found his comments quite useful as a starting point for an exploration of the "thin layer of magic" and how different languages deal with it.

Could you please expand on your definition of the line and explain why you don't think it's clear?

While it seems clear that we see Smalltalk differently, I'm not quite sure what you mean when you say that. So, how do you see Smalltalk?

Yes, an implementation of Smalltalk could model all aspects of the virtual machine in the language itself and give you access to all of the internal components: compilation, method tables, stack frames, garbage collection, object headers, method dispatch, primitives and any other parts of interest.

I would agree that Meta Programming in Smalltalk is just like regular programming. That is a direct result of the meta data being represented as first class objects that can be manipulated like any other objects. Yes, not all the meta data from the virtual machine is accessible.

Slate Smalltalk is attempting to eliminate the "virtual machine" altogether. If successful, will this mean the elimination of the "hard barrier" that you speak of? Not likely, as ALL software running on current microprocessors has the problem you describe relating to "the metalevel for machine language": the metadata of the machine is ultimately inaccessible. Even if you are using a Field Programmable Gate Array chip that you can reprogram at will, there will always be some "meta level" that you can't access, as a result of the processor hardware existing in reality to get the work done.

In addition, in a vm-less system there will be parts that are very sensitive to change, such as the message dispatch code, and any changes could instantly break the system. Well, this is also true of Smalltalk and some other truly dynamic systems like LISP and Forth. The point is that even with access to all parts of a system you realistically may not be able to change it much, or at all, without seriously affecting the system's function or performance. But I suppose at least you'd have a chance. So it's a balancing act to determine how far down this road to travel.

The "meta level" for assembly language "bottoms out" with the machine as you say. An important aspect of meta programming in Assembly language should allow for dynamic (and easy) access to writing and manipulating assembly language programs from within assembly language. Most assembly language systems are not designed this way, yet you could do so if you wished.

The question I have is: of what relevance is this bottoming out, in your view? Obviously things bottom out and there will be a "layer of magic" to make things work. If a system like Slate Smalltalk succeeds in bringing all of this "magic layer" into the language itself or its object library, have we really gained anything? I think so.

Why is "the language as a specification wholly uninteresting" to you? Where is your interest?

I consider the "syntax" of Smalltalk as a limiting factor that shapes, to a high degree, what is possible to express in the system and certainly how it's expressed. Many different "less limited" implementations of virtual machines (or virtual machineless) systems could be built that implement Smalltalk-80 or the ANSI Smalltalk standard.

I've found the language specification of Smalltalk-80 of interest because with some minor adjustments to the grammar the expressiveness of a Smalltalk-like language can be increased significantly. In addition a refactoring of the object library could yield substantial improvements as well, including access to the majority of the meta data in the virtual machine.

The Smalltalk syntax is of great interest since it pushes much of the "magic layer" from the language compiler, where it is most certainly hidden in the vast majority of languages, to the run time environment and the library of objects that are accessible to the running programs themselves. Too few languages do this.

One of my interests is programs that write programs. This is an area that needs major improvements, not just in Smalltalk, but in any language system that writes programs.

I'm not just for First Class Messages and access to the Message Sending System (the message dispatcher); I'm also for Fully First Class Methods and Blocks at the expression level. I'm for Meta Programming. I'm for better implementations, vm or vm-less, of the language and its variants. I'm for many versions of Smalltalk from many vendors and open source projects, to support a growing eco-system for more powerful and literate systems.

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
5/19/2004; 12:24:13 PM (reads: 2208, responses: 0)
re MP in ST: Does anyone know where the tutorial on Behavior mentioned here in the past can be found these days?

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/19/2004; 12:42:49 PM (reads: 2200, responses: 2)
Frank wrote: "by defining a universal type, you can do anything in a statically typed language that you can do in an untyped language."

What's the point of having types if you don't use them!

People talk about delaying things until run-time as if it were an advantage. But it is clearly always easier to do things at run-time than to do the same things statically. It actually lowers the bar.

It's an advantage for all the capabilities that are gained that can't be decided statically. It's not always easier to do things at run-time. Self goes to great lengths to figure out the types of parameters at run-time and generate optimized methods based on that potentially changing information. Since there are many things that can be done at run-time that can't be done statically, the bar is raised in dynamic systems that do their work at run-time. In addition, since performance is usually a concern at run-time, dynamic languages have a high standard to reach and thus a higher bar to cross.
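The Self technique alluded to here — specializing dispatch using run-time type information — can be caricatured as an inline cache. A minimal sketch in Python, assuming a single-entry cache per call site (`CallSite` is an invented name, not Self's actual machinery):

```python
# Sketch of the idea behind run-time specialization of dispatch:
# remember the method found for the receiver's class at this call
# site, and only repeat the full lookup when the class changes.

class CallSite:
    def __init__(self, selector):
        self.selector = selector
        self.cached_class = None
        self.cached_method = None

    def send(self, receiver, *args):
        cls = type(receiver)
        if cls is not self.cached_class:           # cache miss
            self.cached_class = cls                # full lookup, then cache
            self.cached_method = getattr(cls, self.selector)
        return self.cached_method(receiver, *args)  # cache hit thereafter


site = CallSite("upper")
print(site.send("abc"))    # ABC  (first send fills the cache)
print(site.send("def"))    # DEF  (second send hits the cache)
```

Real systems like Self go much further — compiling specialized machine code per receiver type — but the principle is the same: the "type information" is discovered and exploited while the program runs.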

anything done dynamically in an untyped functional language can be done dynamically in a typed functional language, because of Turing-completeness.

I suppose, but Turing-completeness isn't all it's cracked up to be. Just look at the twists and turns involved in translating applications from one language to the next. More often than not the result is, well, cryptic ick.

It is untyped languages that are playing the game of catchup!

You are asserting that "typed" languages are superior to "untyped" languages. Could it be that there is simply no valid need for "typed variables" in a language? Could it be that languages like LISP, Smalltalk, Self, Forth, Postscript, and others get along fine without "types"? Yes, types give you some benefits, but they also cost a tremendous amount in terms of extra development time and increased, bulkier code. Essentially the gains from types just don't provide any worthwhile or compelling advantages. In fact they often seriously hinder development.

Typed code tends to be less generic, thus less reusable, and as a result more code is required to express the same ideas. As we all know: more code, more bug potential. A primary argument for types is that they help stop bugs. Ironically it seems quite possible that "typed" code begets more code, which "requires" types to keep it safe, which begets more code... an icky, self-reinforcing upward spiral that eventually constrains the program's and programmer's expressiveness to the breaking point. Typed programs tend to end up as highly brittle systems.

It's not that typeless systems are perfect; anyone can make a mess anywhere. But if you can get the results without using types, why use them at all? On the occasions when you need to check a type you can do so at run-time and gain the same "type safety". More often than not, static type safety doesn't provide you anything other than an unnecessary and expensive illusion.

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/19/2004; 12:45:45 PM (reads: 2205, responses: 1)
but they also cost a tremendous amount in terms of extra development time and increased and bulkier code sizes...illusion...

[ducks behind wall]

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/19/2004; 12:54:02 PM (reads: 2199, responses: 0)
[ducks behind wall]???

Typed code is more specific, especially when using large object libraries that use "non-universal" types. Since typed code is more specific you need to write more of it to cover all the possible types; thus the program bloats unnecessarily. It's unnecessary because an equivalent dynamically typed program would be smaller. Not illusion, fact.

Let's see why with a brief case study that supports my assertion that "typed" languages require more code to be written.

For many years I programmed in Objective-C on the NeXT OpenStep systems (now MacOSX). Objective-C is a typed language that also borrows a subset of Smalltalk style message passing. Naturally, as someone with lots of Smalltalk experience and a preference for dynamic typing, I would use the universal type "id" (Object) in all of my code. Even so, I was still forced to use C types for all of the non-objects. So in this mixed language I have some choice about types but also some requirements that I must follow. Ok, no sweat, I can survive.

The commercial programs that I wrote tended to be larger in Objective-C than in Smalltalk. This was a result of always having to "type more" (no pun intended) characters to define the types of all the variables in all the method and function prototypes. In addition, the methods tended to be upwards of twice as long as equivalent Smalltalk methods would have been with a Smalltalk version of the (otherwise) excellent OpenStep Application and Foundation Kits. This is a result of the way that Objective-C handles types, and it impacts the design of software.

I spent a lot of my debugging time chasing down silly typos that prevented the program from compiling or running. About half of these were simple errors related to "variable types" of one sort or another. If I'd been using Smalltalk I would have saved much of that time and likely delivered the system faster. That would have given me a larger profit on a number of the jobs and cost the client less on some of them as well.

There are real practical financial consequences to using a typed language. I'll take the untyped language that gives me the financial and competitive edge.

Matt Hellige - Re: The Case for First Class Messages  blueArrow
5/19/2004; 1:35:51 PM (reads: 2185, responses: 0)
[ducks behind wall]

Seriously...

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/19/2004; 3:39:21 PM (reads: 2134, responses: 1)
Let's take another example from the real world. The Das Kapital project at JPMorgan uses Smalltalk and has over 14,000 classes in it. A very large system by any standard. It is a highly successful project, making them money hand over fist. The Smalltalk team, which I was a member of, took over from an OpenStep team and within a year created a highly complex and very successful financial trading system. We couldn't have done that using a "typed language". In fact, one of the primary reasons JPMorgan switched from Objective-C to Smalltalk was the extra productivity gained from the "typeless" Smalltalk language. The "typed language" attempt at the project had failed; time had been lost to slower development tools and a typed language, and Smalltalk was there to save the day and produce results very fast. The results at JPMorgan speak very loudly indeed.

I have a question for those of you who like types, when you have classes why do you need types?

Anyway, the point of this thread is having messages be first class objects by way of having rich meta data for messages. I support these kinds of improvements in all languages that can make it happen.

Daniel Yokomizo - Re: The Case for First Class Messages  blueArrow
5/19/2004; 6:50:24 PM (reads: 2123, responses: 0)
I have a question for those of you who like types, when you have classes why do you need types?

1. When we don't have classes because inheritance is a wrong solution to a fuzzy problem (i.e. prototypes and delegation instead of classes and inheritance).
2. Classes mix several concepts (i.e. code reuse, classification and data layout) while types are semantically simpler because they work with only one concept.
3. Types are the best way to make code properties explicit, so with an advanced type system you can write less code than you would with a unityped language (e.g. show . (/2) . read), also you can state invariants that may be impossible to test but are possible to check statically.

[on edit: in 1, "types" was a typo, corrected to "classes"]

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/20/2004; 1:38:59 AM (reads: 2076, responses: 1)
"Oh gawd, not this discussion again." --Frank :-)

I am presuming good will though.

I have a question for those of you who like types, when you have classes why do you need types?

When you have apples why do you need oranges?

As Daniel already mentioned, classes are a bundled offer. Coming from a mostly Java background, I usually do not consider data layout seriously, so my definition of (badly overloaded) classes is: unit of code reuse, classification, and instantiation (meaning identity is coupled to instances of classes). I think these roles should be played by mixins, interfaces, and classes respectively (at least if one does not want to change the language more drastically). In other languages mixins may be called modules or components, interfaces may be called types or specifications, and classes, well, there may be no classes :-)

In fact, it is possible to find languages located at any point of the space with these three dimensions. By bundling all the three aspects together one kills a lot of diversity and freedom in the design of a PL.
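The three-way split proposed above can be sketched in Python, using a structural `Protocol` for classification, a mixin class for code reuse, and a plain class for instantiation (all names here are invented for illustration):

```python
from typing import Protocol

class Keyed(Protocol):
    """Classification only: anything with a key() method conforms,
    structurally, without inheriting from this Protocol."""
    def key(self) -> int: ...

class ReprMixin:
    """Code reuse only: never instantiated on its own; assumes the
    host class supplies key()."""
    def describe(self) -> str:
        return f"{type(self).__name__}({self.key()})"

class Account(ReprMixin):
    """Instantiation: identity and state live here."""
    def __init__(self, balance: int):
        self.balance = balance
    def key(self) -> int:
        return self.balance

print(Account(10).describe())   # Account(10)
```

Bundling all three roles into one `class` construct, as most OO languages do, is exactly the loss of design freedom the post describes: here each role can vary independently.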

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
5/20/2004; 3:02:03 AM (reads: 2070, responses: 0)
A simple example is binary methods. When the module system is separate from the inheritance mechanism they become easier to write.

But obviously if what you are asking is why classes shouldn't be used to imitate types, the reason is static type safety. If you don't want that, this reason won't satisfy you, I guess...

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/20/2004; 3:48:07 AM (reads: 2059, responses: 1)
Yes, there are benefits of "prototype" objects over class based instances. Yes, classes are bundles of goo that mix several concepts. That wasn't what I was asking, though. Also, we are not talking about "apples" and "oranges"; we're talking about "typed" or "untyped" variables, which are two ends of a spectrum of choices. Let's refine the question a little further.

When you have objects (or entities or whatever you want to call them) that carry "type information" via their "prototype" or "classification" along with them why do you need "typing information" to constrain "variables" in object instances, method parameters and method variables?

Why is it so important to "type" variables? Why is it so important to constrain a variable to be an "integer" or an "array" when the object contained in that variable knows its own "type" or "class"?

What code properties are made explicit with types? How can "types" be "the best way" to make code properties explicit? How are types "semantically" simpler?

If you don't use "objects" that carry their own type information then how do you do "object oriented programming"?

There seem to be two sides of the coin when it comes to variables: having the variables carry the "type" information, and having the thing the variables point to carry the "type/class/prototype" information. Both kinds of systems exist, including systems that do both. Many successful applications and systems in major operation use these various approaches. There are differences between the approaches that affect not only the source code that is written but also the thinking styles of the people involved; or perhaps it's the thinking styles of people that lead them to choose a particular language. In any case there are major camps, and much has been written slinging back and forth about which is better. The point is that it comes down to some basic choices and human values about what is important. It is a worthwhile subject to pursue and learn, since it dramatically affects the design of not only computer languages but also operating systems, applications, software quality, development time and costs, the scope of applications, and ultimately the economics of the majority of those involved.

The "typed" or "untyped" variable discussion is relevant to the "case for first class messages" since they are both questions about "meta data" and how it's handled in the language and library of objects that come with the system.

The arguments for "typed variables".

1) Types allow "static" compile time "code" performance optimizations.

2) Types allow "static" compile time "code quality" by ensuring that "operations" only are done to "variables" of the "correct type".

3) Types are "semantically simpler" because they work with only one concept.
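As a rough sketch of argument 2 (Python used here as a stand-in; being dynamically checked, it surfaces the same mistake at run time that a static checker would reject at compile time):

```python
def add_one(x):
    # No declared type on x: any object reaches this point.
    return x + 1

assert add_one(41) == 42

# A static type checker would reject add_one("oops") before the
# program ever ran; here the mistake is only caught on execution.
try:
    add_one("oops")
    caught = False
except TypeError:
    caught = True
```

The trade the two camps argue over is exactly when that error report arrives: before the program runs, or while it runs.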

The arguments for "untyped variables".

1) Since objects carry their own "class" or "prototype" information with them it's "redundant" and "unnecessary" to "type variables".

2) When variables are "typed" the flexibility of methods is reduced and constrained; that is the purpose of typed variables. As a result methods become "more specific" by limiting the "types" of objects that can be "put into" a variable (instance variable, method parameter variable or temporary method variable). By becoming more specific about what can be put into variables, the methods can't be used in as many "contexts" with as many kinds of "objects". (If you only use one "universal type" throughout an entire program then you are, in part, not really using typed variables, though you might not have "dynamic binding" capabilities, depending on the language.)

If a variable is typed it can only take objects of that kind. For example, if we type a counter variable as an integer as follows, we can only store integers in it.
aCounter Integer;
Now this method can only hold integers in this variable. What if a case comes up that needs more generic "numbers" such as floats or fractions (which are supported in Smalltalk)? The method is limited to "integers" and won't work, so another method would need to be created, increasing (doubling) the code size. Yes, you could type the variable as a "Number" and avoid the duplication, but now you have a variable that can't handle other kinds of "Magnitudes" (or other types of ordered sequences).
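The counter example can be sketched in Python (a hypothetical stand-in for an untyped-variable language; the names are illustrative):

```python
from fractions import Fraction

def increment(counter, step):
    # The counter variable carries no declared type: any object
    # responding to "+" works, so one method covers integers,
    # floats and fractions where an Integer-typed variable would
    # have forced duplicate methods.
    return counter + step
```

The same method then serves `increment(1, 1)`, `increment(1.5, 0.5)` and `increment(Fraction(1, 2), Fraction(1, 2))` without change.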

When variables are typed across a large group of classes, the method parameters (function prototypes) are limited, and often multiple function prototypes are created to allow for each additional type. Thus you end up with more methods.

3) It's faster to write code that is "typeless". In many if not the majority of cases it leads to methods that are shorter and more generic.

It's possible to write Smalltalk methods that are overly specific. It's a bit of an art to write methods that are generic. Often the first few iterations of an object's methods are too specific. Over time, as an object is recognized to be useful in more "contexts", its code can be evolved into a "more generic" style. Sometimes methods are split off of an object to become objects of their own (a verb (method) becomes a noun). More generic methods tend to be smaller and are often fragmented into multiple methods, enabling even more code reuse and reducing overall code size even further. Typical method sizes in Smalltalk itself are on the order of under nine lines; this was an unofficial guideline of the original Smalltalk team for recognizing opportunities to gain code reuse through fragmentation. When small methods are "typed" their "reusability" is limited, since they can only be used with certain types that are passed into them. By staying generic, methods become usable in more contexts, increasing code reuse. Of course this depends upon the purpose of the objects in question.

Types are often presented as a way to limit the use of objects to those intended by the original designers. In some languages "static classes" are declared, locking down the functionality of these classes in the name of "safety". This comes at the expense of reuse. If a "static class" is too limited, users will have to "reinvent" the majority of the code by writing a new variant class to get the job done. If the class weren't declared static it would often be possible to "expand, evolve or iterate" its design to cover new contexts that the original designers had not thought of. Obviously this is more difficult with complex objects that have complex algorithms, but in many cases it enables a class to take on a few more capabilities, reducing the overall number of classes, reducing code size, and increasing the reliability of the system by leaving less code to debug and maintain. Untyped code tends to be less cryptic and easier to read.

--

Limiting what ends up in a variable can easily be handled at run time with dynamic tests of the "class" or "prototype" information that objects carry themselves. This is most often done for object data validation especially in places where information is coming from "input" sources. For example, in a Graphical User Interface window it's often important to check and filter the kinds of data that are input into certain fields.
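A minimal sketch of such a run-time check (Python as a stand-in; the function name and message are hypothetical):

```python
def validate_field(value, allowed_types):
    # Dynamic test using the class information the object itself
    # carries - e.g. filtering what a GUI input field will accept.
    if not isinstance(value, allowed_types):
        raise TypeError(
            f"expected one of {allowed_types}, got {type(value).__name__}")
    return value
```

So `validate_field(5, (int, float))` passes the value through, while `validate_field("five", (int, float))` is rejected at the point of input rather than by a compile-time declaration.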

There really are very different styles of programming when it comes to typed and untyped languages. These styles are dramatically different although they might seem similar.

The meta data carried by objects themselves provide a rich source of information that can be accessed to provide "type protection" for type sensitive sections of code.

What is certain is that successful systems can be built in both typed and untyped languages. The author asserts, from over twenty-five years of programming experience, that untyped languages offer a more productive environment in which to build programs. This is a point of view shared by many in the industry. It's also clear that many people are fervent about this issue; many in the industry take the point of view that typed languages are superior and build successful systems with them. The proponents of untyped languages believe that their successes and productivity gains speak for themselves. Ultra-large systems like JPMorgan's Das Kapital speak for themselves. In systems like Das Kapital, where you have tens of thousands of classes, it's usually important to be able to comprehend as many of them as you can so you can maximize your use of the system. It's also important to keep the number of classes (or prototypes) as small as possible so that people have a chance of comprehending as much as they possibly can. As we've seen, code size, number of methods, and number of classes/prototypes can all be reduced using a generalized "untyped variable" writing style.

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/20/2004; 4:46:11 AM (reads: 2063, responses: 0)
I only hope there was no trolling...

1) Since objects carry their own "class" or "prototype" information with them it's "redundant" and "unnecessary" to "type variables".

The assumption that values know their types at runtime is pretty strong. I am not against run-time type info in general, but in many cases it can be proved by the compiler that it is redundant and unnecessary to check the type of a value at runtime.

2) When variables are "typed" the flexibility of methods is reduced and constrained.

But not necessarily more than is needed to carry out their responsibilities. If your typing is structural (as opposed to nominal), or your PL has implicit parameters or type inference, the difference between the set of values that could meaningfully be bound and the set allowed by typing is very small.
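The structural-typing point can be sketched with Python's `typing.Protocol` (the class and method names here are purely illustrative):

```python
from typing import Protocol

class Quacker(Protocol):
    def quack(self) -> str: ...

def provoke(duck: Quacker) -> str:
    # Structural typing: any object with a matching quack() method
    # conforms; it never has to name or inherit from Quacker.
    return duck.quack()

class Robot:  # no mention of Quacker anywhere
    def quack(self) -> str:
        return "beep"
```

A static checker accepts `provoke(Robot())` because `Robot` has the right shape, so the set of values the type admits is close to the set that would "meaningfully work" under dynamic dispatch.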

3) It's faster to write code that is "typeless". It leads to, in many if not the majority of cases, to methods that are shorter and more generic.

Did you hear about type inference?

In systems like Das Kapital where you have tens of thousands of classes... It's also important to keep the numbers of classes (or prototypes) as small as possible...

I don't consider tens of thousands of classes to be speaking for your cause. Do you imply that if written using a PL with static type checking there would be more conceptual entities?

Mark Evans - Re: The Case for First Class Messages  blueArrow
5/20/2004; 3:41:51 PM (reads: 1998, responses: 4)

More Peter William Lount philosophy (no value judgment implied by my post):

What is the simplest syntatic [sic] language needed to provide a full featured computer language that crosses the Wolfram Computational Equilivance Threshold where simple systems become capable of generating systems as complex as any more complicated language?

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/20/2004; 8:39:03 PM (reads: 1981, responses: 0)
Andris wrote: I don't consider tens of thousands of classes to be speaking for your cause.
The application covers a very wide and deep application area. As I said it's very large. The point is that with very large systems it's better to have a language that has less "clutter" and "baggage" in each method. Typed variables are unnecessary and just add clutter.

Do you imply that if written using a PL with static type checking there would be more conceptual entities?
Yes, that's what I'm saying. There is a significant tendency for languages using statically typed variables to have more conceptual entities. This is caused by the tendency towards more and larger methods, which leads to more complex classes. More complex classes are more difficult to add to since they contain more code. In addition, in languages with "static classes" developers are often forced to duplicate functionality since they can't access the internals of the locked-down "static classes". So they add a new class instead of enhancing an existing one.

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/20/2004; 8:53:22 PM (reads: 1982, responses: 0)
where simple systems become capable of generating systems

Meaning that the human factor is obsolete? Just run the simplest language that could possibly work overnight, and you have the code (source or binary?).

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/20/2004; 11:42:02 PM (reads: 1971, responses: 1)
Andris wrote: Meaning that human factor is obsolete?
No. Wolfram has discovered that there is a threshold which, when crossed, enables very simple systems to generate complexity as great as any complex system can generate. See "A New Kind of Science and the Future of Mathematics". I was not, however, intending to suggest replacing humans. I was intending to illustrate that languages with a simpler syntax can do as much as those with a more complex syntax.

Just run the simplest language that could possible work overnight, and you have the code (source or binary?).
No, I was not talking about autonomous "AI" program generators. That wasn't a meaning I intended for people to get. It seems that you took the quote out of its context and its meaning changed.

The point is that "typed variables" are not necessary in a computer language. Untyped languages demonstrate that.

Typed variables, in my view, increase the complexity of a language beyond what is necessary. I've not heard any compelling reasons for typed variables. Untyped languages are less complex yet can be used to implement complex systems. Turing and Wolfram support this; I was referring to Wolfram's discovery to support this view. Wolfram demonstrates that "less complex" systems can generate complex results that are on par with complex systems. I was attempting to point out that this applies to computer languages as well.

In addition I'm wondering if the syntax of Smalltalk can be simplified even further while maintaining its capabilities. It seems that there might be some wiggle room to simplify it a little. At least with some minor adjustments the Smalltalk syntax can be improved while keeping it simple.

Along these lines, I've been able to take the Smalltalk syntax and run with it to create a new language known as Zoku that integrates many new capabilities into the language and system. I'm currently developing the Zoku compiler (using Squeak as a development platform).

The article is still a rough draft being edited. That particular paragraph needs to be rewritten for clarity. Thanks for pointing it out. I can see that the article needs more work.

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/21/2004; 4:07:01 AM (reads: 1955, responses: 0)
Well, I've read "The Future" after it was cited on LtU.

It amazed me that the author assigns so much value to syntactic simplicity of transformation rules, while ignoring complexity of rules interpreter.

Even if you express your program exclusively in the form of S and K combinators (and even one combinator is enough), you still are using the full power of logic theory, which is anything but simple.

I will not repeat my arguments why I think statically typed code need not be bigger or more clumsy than dynamically typed one. I can offer a narrower setting, though: why not discuss which approach is superior, passing arguments as a dictionary or as a statically typed record?
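Andris's narrower setting can be sketched side by side (Python as a stand-in; `Point` and the norm functions are hypothetical names):

```python
from typing import NamedTuple

# Statically typed record: field names and types are fixed up front,
# so a misspelled field is caught by a type checker.
class Point(NamedTuple):
    x: float
    y: float

def norm_record(p: Point) -> float:
    return (p.x ** 2 + p.y ** 2) ** 0.5

# Dictionary-passing: maximally flexible, but a mistyped key like
# p["z"] fails only when that line actually runs.
def norm_dict(p: dict) -> float:
    return (p["x"] ** 2 + p["y"] ** 2) ** 0.5
```

Both compute the same answer for well-formed input; the disagreement in the thread is about what happens to the ill-formed input.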

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/21/2004; 6:16:20 AM (reads: 1940, responses: 2)
no value judgment implied by my post

really?

anyway, sounds like he would be interested in chaitin's work.

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
5/21/2004; 8:25:58 AM (reads: 1940, responses: 0)
Peter: You are asserting that "typed" languages are superior to "untyped" languages.

All other things being equal, for software development, absolutely, yes.

Peter, I started replying to your arguments, but the truth is (and this is why everyone is groaning) that:

  • I have covered this ground here and elsewhere, many times before,
  • after looking through your posts, it is clear to me you are ill-informed about typed programming languages, and
  • you are not accustomed to supporting your claims in an unambiguous manner.

Given these facts, in order for me to reply to your arguments in a way you would understand, I would have to write quite a lot and I'm not willing to do that just now.

So, I took what I wrote and started writing a (hopefully) short paper called `How to argue about types', which I'll make public when I'm satisfied with it.

For now, let me just say:

Peter: Wolfram has discovered that there is a threshold that when crossed enables very simple systems to generate complexity as complex as any complex system can generate.

It has not been demonstrated that they can generate any complex system, only certain ones of limited complexity.

Typed variables, in my view, increase the complexity of a language beyond what is necessary. I've not heard any complelling reasons for typed variables.

It's clear you haven't looked very hard. Are you even familiar with typed languages like Standard ML, Objective Caml, Haskell, Nice or Scala? Hindley-Milner type inference? Parametric polymorphism? Polymorphic lambda-calculus? Dependent types?

I'm sure you are a fine and accomplished programmer, but that alone does not equip you with the tools one needs to reason unequivocally about programming languages.

BTW, from your article:

The more complex grammar and syntax of C, C++, Perl, and Java does not give them any more computational sophistication than Smalltalk. In terms of fully featured programming languages, Smalltalk provides a bare minimum of capabilities needed to cross the "Wolfram Computational Equilivance threshold" or as I'll call it the "less is more threshold". This principle gives Smalltalk a tremendous expressive advantage over more complex language grammars.

I am itching to criticize this in a large number of ways, but will just remark:

This is an extraordinary claim. Can you prove it? You can elide the anecdotal evidence and dogma; I'm impervious to it. :)

Frank Atanassow - Questions for Peter  blueArrow
5/21/2004; 8:55:34 AM (reads: 1940, responses: 0)
Also, Peter, I would like you to respond to these questions. You wrote:

Typed code tends to be less generic, thus less reusable and as a result more code is required to express the same ideas.

  1. Less generic/reusable than what?

  2. Given your answer to the above, are you sure they are `the same ideas'?

  3. There is a sense in which the integers are more generic/reusable than the real numbers. Real numbers are more complex and difficult to understand than integers, and it would be easier simply to replace each real n by its floor (or ceiling, or truncation, or whatever), and to modify arithmetic operations like division so that the quotient of two integers always produces the closest (or whatever) integer. Do you believe this is a tidier approach to mathematics and, based on this, would you recommend we abandon the reals and work exclusively with the integers?

I await your reply.

Mark Evans - Re: The Case for First Class Messages  blueArrow
5/21/2004; 1:44:24 PM (reads: 1905, responses: 1)
For Andrew Cooke: Chaitin, op. cit.

Mark Evans - Re: The Case for First Class Messages  blueArrow
5/21/2004; 1:58:03 PM (reads: 1897, responses: 4)
For Frank: An article is an excellent idea. Post it on your home page and link to LtU. In the meantime, we have the discussion with Oleg about types.

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/21/2004; 2:48:53 PM (reads: 1898, responses: 0)
sorry, but why is this for me?

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/21/2004; 2:55:19 PM (reads: 1892, responses: 3)
mark, were you writing/wrote a faq (the site is terribly slow, so it's difficult to check)? if so, and it didn't contain references to various typed v untyped discussions, maybe it could? (oo! - site just suddenly got fast for one page and i can see the faq and it's pretty tiny, so maybe this was scratched, i don't know...)

(incidentally, ehud, i've given up on further alternative site stuff. you're acct is still the only one i send email to that bounces, which makes it a lot of fuss...)

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
5/21/2004; 2:59:07 PM (reads: 1896, responses: 1)
(incidentally, ehud, i've given up on further alternative site stuff. you're acct is still the only one i send email to that bounces, which makes it a lot of fuss...)

Very strange. I'll send you an alternate address. I hope the reason for giving up isn't my email account. It would be quite a lame reason, if this is the only problem...

Patrick Logan - Re: The Case for First Class Messages  blueArrow
5/21/2004; 3:21:06 PM (reads: 1885, responses: 2)
I will not repeat my arguments why I think statically typed code need not be bigger or more clumsy than dynamically typed one.

The operative phrase being "need not". The best "statically typed" languages (e.g. I would say Haskell, Clean, O'Caml in my experience) are very expressive. I simply do not know enough about them to say what it would take for me to be as productive with them as I am with the best "dynamically typed" languages (e.g. I would say Lisp, Smalltalk, Python in my experience).

So to put it another way, I don't know where the edges are in these languages vs. where the learning curve is for me personally. I've only done small samples in them. I see more stuff coming from these languages, but I've yet to be convinced a decent programmer will be X percent more productive, where X is an interesting number, say, X >= 2. For X < 2, a developer familiar with the older languages is better off staying put while the newer languages are improved.

I can offer a narrower setting, though: why not discuss which approach is superior, passing arguments as a dictionary or as a statically typed record?

All things being equal I would choose the statically typed record. But what I have found in practice is that all things are not equal. In any case, this isolated example is too narrowly contrived for a useful discussion. Obviously in Smalltalk, Lisp, etc. we do not pass dictionaries around as the single argument to all functions.

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
5/21/2004; 3:27:10 PM (reads: 1886, responses: 1)
How about this being the root cause of these debates: Any language design issue can be framed as a debate about type systems.

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
5/21/2004; 4:29:58 PM (reads: 1870, responses: 1)
About typing variables, is there any reason why the following code would be invalid:

tmp = 1
a = tmp + 1
tmp = "hello"
b = tmp ++ " world"

In other words, if you have static type inference, should it be smart enough to allow the tmp variable to contain values of different types at different times?

Daniel Yokomizo - Re: The Case for First Class Messages  blueArrow
5/21/2004; 7:34:32 PM (reads: 1856, responses: 0)

import Data.Maybe (fromJust)

test = fromJust $ do
                     let tmp = 1
                     let a = tmp + 1
                     let tmp = "hello"
                     let b = tmp ++ " world"
                     return (a,b)
Works in Haskell. As you said different types at different times implies ordering so I used the monadic notation to make this explicit. Now if you want to use a real variable (i.e. IORef in Haskell) then you would have a problem, because what would happen to another piece that has a reference to "tmp"? What's the type of "tmp" for them? If the usage is local and time ordered we can use different scopes (e.g. my example) to achieve this, otherwise we can't mix the two types safely.

Chris Rathman - Re: The Case for First Class Messages  blueArrow
5/21/2004; 7:34:32 PM (reads: 1850, responses: 0)
Given that you are effectively using tmp as a storage value, you'd have to rule out using lazy evaluation - the reasoning becomes time dependent, possibly losing referential transparency with side effects introduced.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/21/2004; 8:22:49 PM (reads: 1853, responses: 0)
My views are not intended to be unequivocally absolute statements. Please do not interpret them that way.

Andris wrote: It amazed me that the author assigns so much value to syntactic simplicity of transformation rules, while ignoring complexity of rules interpreter.

I agree that there is a balance to watch. Having a rules interpreter that is too complex isn't desirable either. This is why "simplistic" grammars, while interesting and important from a theoretical point of view, aren't practical for everyday programming. Yes, I'm interested in minimizing the complexity of language syntax since I feel that there is a correlation between the complexity level of a syntax and the productivity of programmers and the end results that they produce using it. Yes, I've asserted that the more complex the grammar the less productive programmers are likely to be. Yes, I'm interested in languages that reduce complexity if possible.

In fact, thanks for mentioning this balance. I've encountered this phenomenon in my work creating the Zoku language, which is a variant of Smalltalk. In these explorations there have been cases where I've simplified the grammar too much, pushing the "work load" over to the virtual machine at runtime (or, if you wish, the rules interpreter). So I'm walking this line to find out where it goes. Some of the results have been unexpected and yet acceptable, while others have tipped the balance too far one way or the other. Language design takes work and some willingness to travel where the road leads, to see if that's where you really want the users of the language to follow. Yes, this is subjective.

I will not repeat my arguments
Could you provide some links to them?

... I think statically typed code need not be bigger or more clumsy than dynamically typed one.
As a start, just having the type specifications on each variable makes the code somewhat bigger. Type constraints in a language like Smalltalk would simply limit the code too much and as a result increase its size while decreasing its generality. This is a tendency, not an absolute. In my experience Smalltalk methods are frequently smaller and more generic than those of Objective-C, C++ and other typed languages of a similar ilk.

... why not discuss which approach is superior, passing arguments as a dictionary or as a statically typed record?
Well if that's something you'd like to discuss then... it depends on what you mean by "superior".

Frank wrote: after looking through your posts, it is clear to me you are ill-informed about typed programming languages
That is your opinion, and it's inaccurate. My views are based upon twenty-five-plus years of experience using typed and untyped computer languages in real-life practical projects. Also, there is no need to get personal by labeling me "ill-informed". Please see argumentum ad hominem.

you are not accustomed to supporting your claims in an unambiguous manner.
Ah, getting personal again. And just how would you know what I'm accustomed to?

If there are ambiguities in my arguments please ask me questions to clarify them and I will do my best to answer them.

Given these facts
What facts? Your two assertions about me are not facts but your opinion. Then you simply state that you've covered these grounds before, ok I'll accept that your statement about that, but what specifically are you referring to? You seem to be attacking me and my arguments but you don't actually present yours or specific links to your arguments? Could you at least provide a link?

I would have to write quite a lot and I'm not willing to do that just now.
Fair enough. Take your time to prepare a thoughtful article.

It has not been demonstrated that they can generate any complex system, only certain ones of limited complexity.
I would generally agree that it hasn't been proven they can generate any complex system. However, Wolfram has demonstrated that they can generate systems of complexity equal to any complex system.

A language like Smalltalk is obviously more complex than the systems described in Wolfram's work. My point is that I've not seen any compelling arguments for the addition of typed variables to Smalltalk. What is the compelling benefit of typed variables?

It's clear you haven't looked very hard.
Again getting personal. How would you know how hard I've looked? This is obviously your opinion about me again. Can we stick to the topic and stay away from opinions about each other?

Are you even familiar with typed languages like Standard ML, Objective Caml, Haskell, Nice or Scala? Hindley-Milner type inference? Parametric polymorphism? Polymorphic lambda-calculus? Dependent types?

I'm familiar with these languages and ideas. They certainly have very interesting characteristics, and in some of these languages it makes sense to have and use typed variables. Types do seem to make some sense in functional languages, especially when you don't have objects to hold the type information. In no way am I saying that these systems are not "consistent" or don't provide benefits; they do work and provide value.

Let me rephrase my statement to make it more specific.

"Typed variables, in my view, increase the complexity of languages, such as Smalltalk, Java, C# and C++, beyond what is necessary. I've not heard any compelling reasons for typed variables in these kinds of languages."

I'm sure you are a fine and accomplished programmer, but that alone does not equip you with the tools one needs to reason unequivocally about programming languages.

Again you are getting personal by making judgmental statements about me.

If you took my statements to be "unequivocally" and "categorically absolute" statements about programming languages that is not the context or meaning that I intended. The statements I've been making are in the context of languages such as Smalltalk, C++, Java, C#, etc. I've not been intending to make comments about functional languages. In retrospect that might be how it's interpreted on a blog entitled "lambda the ultimate". It's clear that typed variables and typed languages offer benefits in some circumstances. I've simply been asking what those benefits are and so far the responses I've gotten have provided little if any substantial discussion about the actual benefits of types and typed variables.

I wrote: "The more complex grammar and syntax of C, C++, Perl, and Java does not give them any more computational sophistication than Smalltalk. In terms of fully featured programming languages, Smalltalk provides a bare minimum of capabilities needed to cross the "Wolfram Computational Equilivance threshold" or as I'll call it the "less is more threshold". This principle gives Smalltalk a tremendous expressive advantage over more complex language grammars."

Frank wrote: This is an extraordinary claim. Can you prove it?

"The less if more threshold" is more of an observation than an "extraordinary" claim. This concept is still under development. The article the quote comes from is a draft still being edited and is labeled as still being edited. I thank you for your expert feedback.

There is no doubt that Smalltalk is a capable object oriented programming language. It does quite nicely using classes and untyped variables. Smalltalk's typeless grammar is simpler than the typed grammars of C++, Java, C# and other similar languages, and yet complex systems can be built with Smalltalk as with those other languages. It seems to me that the existence of working applications built in Smalltalk is direct evidence that typed variables are not necessary in Smalltalk. What more proof than that do you need that typed variables are not necessary in this kind of language?

Look, if there is a compelling benefit to adding typed variables to a language like Smalltalk I'd gladly accept it to gain that benefit. What is the benefit?

In my view, the argument that is put forward most often, type-safety, just doesn't provide a compelling benefit in the vast majority of real world applications, and it actually provides a disadvantage in most cases.

Maybe it's a good idea to add some extra assurances in a Nuclear Power Plant control system, but that's not the typical kind of application that most programmers are dealing with. If you want that "type-safety" benefit, just be willing to pay for it in your project schedule and costs.

The very large application, JPMorgan's Das Kapital, was built using Smalltalk, a typeless language. One of the reasons Smalltalk was chosen was specifically because it was typeless! The people in charge of the project made a technical decision to go with Smalltalk over other solutions because they felt - from their experience using both - that a typeless language was a better tool for their business requirements. It worked: the program is a huge success, helping JPMorgan staff make lots of money.

You can elide the anecdotal evidence and dogma; I'm impervious to it.
I'm not attempting to convert you. If you feel that typed languages are better that's your choice. I am simply attempting to understand what the compelling reasons are for your seemingly pro-typed choice. If what I'm writing is coming across as dogma I apologize as that was not intended. I will attempt to take your comments into account in my writing. Yes some of my evidence is "anecdotal". All of it is based on case studies or direct experience in using typed and untyped languages. However, I know of no "double blind" studies that have been applied to computer programmers and typed and untyped computer languages to determine which is "superior" or "better". Thus we are left to other means to make these determinations for ourselves. Obviously people are coming to different conclusions on the question of typed variables. I'm wanting to understand why that is the case as it has a direct impact upon the systems I build, the people I work with, and the new language that I'm creating.

The following questions were asked by Frank about my statement: Typed code tends to be less generic, thus less reusable and as a result more code is required to express the same ideas.

1. Less generic/reusable than what?
Typed-variable code is less generic/reusable than untyped-variable code, of course. Simply put, typed variables constrain the objects that can be stored in them. By definition this limits the software; therefore it's less generic and reusable. What happens is that typed variables limit the use cases that any particular code can handle, and additional code is needed to handle additional use cases (many of which only emerge later in the design and implementation cycle). In many cases methods can be "generalized" to handle all of those cases; in many cases not using typed variables in the first place is enough to "generalize" the method.
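The "generalized method" claim can be sketched in Python (a hypothetical stand-in; the function name is illustrative):

```python
def total(items):
    # No declared element type: the very same method serves numbers,
    # strings, lists - anything supporting "+". A version whose
    # parameter was typed as "array of Integer" would need siblings
    # for each of these use cases.
    result = items[0]
    for item in items[1:]:
        result = result + item
    return result
```

One untyped method covers `total([1, 2, 3])`, `total(["ab", "cd"])` and `total([[1], [2, 3]])`, which is the kind of reuse the paragraph above describes.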

2. Given your answer to the above, are you sure they are `the same ideas'?
Yes, I'm comparing typed variable code with untyped variable code. They are comparable. I am not comparing "typed variables" with "types".

3. There is a sense in which the integers are more generic/reusable than the real numbers. Real numbers are more complex and difficult to understand than integers, and it would be easier simply to replace each real n by its floor (or ceiling, or truncation, or whatever), and to modify arithmetic operations like division so that the quotient of two integers always produces the closest (or whatever) integer. Do you believe this is a tidier approach to mathematics and, based on this, would you recommend we abandon the reals and work exclusively with the integers?

No, that sense isn't what I've been talking about. The interpretation that you are suggesting has absolutely nothing to do with any of my arguments. If you've been viewing my comments from the point of view indicated by this question then no wonder you aren't getting what I'm saying. Either my writing isn't clear (likely, since the article is a draft and this is a blog "comment") or I've simply missed discussing an important topic that needs to be included so that everyone is on the same page. Maybe we can begin making some progress now in communicating with each other.

It's clear that in a language like Smalltalk the "type" information of an integer or a real number is shifted from the "type" of the variable to the object contained within the variable. In Smalltalk an instance of an integer or a real (a "Float") carries its own type information with its class. The type information isn't lost; it's simply shifted from the variable to the object contained in the variable. In a class-based language the class takes over this role, while in a prototype-based language, like Self, the prototype and its "traits" objects take over this role. This shifting of "responsibility" for the "type" information is what enables a language such as Smalltalk to have "untyped" variables and thus to be an "untyped" language.

Once the "type" information has been shifted from the variable to the object contained within (more accurately pointed at by the variable) the variable no longer "needs" any "variable type" information to perform valid processing. Smalltalk proves this in spades. In essence in Smalltalk all variables (instance variables, method parameter variables, method temporary variables, class variables and global variables) are universally typed as "Object".
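The same shift can be sketched in Python, another language whose variables are untyped in this sense (the variable name is my own example): the class rides along with the object, so the variable needs no declared type of its own.

```python
# The class travels with the object, not with the variable: the same
# untyped variable holds objects of different classes in turn, and each
# object can always report its own class.
x = 42
print(type(x).__name__)    # -> int
x = 3.14
print(type(x).__name__)    # -> float, same variable, nothing redeclared
x = "hello"
print(type(x).__name__)    # -> str
```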

The untyped variable point of view is really very simple, and it does not mean destroying distinctions between types since those kinds of distinctions are maintained within the objects classes.

I hope that I've addressed your questions with some clarity.

Mark Evans - Re: The Case for First Class Messages  blueArrow
5/21/2004; 10:53:22 PM (reads: 1824, responses: 0)

sorry, but why is this for me?...faq...?

For you in order to clarify that it's not responding to Frank's preceding math example, because you mentioned Chaitin, and to avoid any connection to Wolfram's work, best discussed here.

You're right about the FAQ, I still owe it to Ehud. If you feel an urge to write one, go for it...

Mark Evans - Re: The Case for First Class Messages  blueArrow
5/21/2004; 11:16:49 PM (reads: 1826, responses: 0)

The root cause is that the mess we call C++ gave a bad name to types. As Andris and Frank hinted, static type inference remains a great unknown in some circles -- present company excepted.

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/22/2004; 2:24:03 AM (reads: 1778, responses: 0)
Whoa, we should record this thread as the longest since I came to LtU :-)

Seriously, aren't we discussing several topics here? What about splitting them into fresh threads and posting links here?

Like:

  • Complexity (looks more like philosophy than comp.sci. to me)
  • How to argue about types (FAQ or an article)
  • Design ideas for Smalltalk/Zoku (sounds interesting)
  • Power of syntax (cognitive issues?)
  • Aspects of types (taxonomy of stuff like type inference, dependent types, implicit parameters, HOT, multi-staged types, specifications, and which languages have what to which extent)

And more, including the statement that stirred the pot: "The more complex grammar and syntax of C, C++, Perl, and Java does not give them any more computational sophistication than Smalltalk". And no more than Scheme, I would add ;-) Yes, I indeed originally missed the point, sorry, Peter.

By the way, despite me visiting this blog, I still use Java as my only language for a living. And not because of its technical merits.

Ah, and I suggest that everyone interested in this discussion read something like BenefitsOfDynamicTyping so we get a "clueful religious debate" :-)

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
5/22/2004; 3:58:11 AM (reads: 1767, responses: 1)
So there is no argument about typing variables, just a different point of view? It makes sense in functional languages, but not in imperative languages.

Can we all agree on that, and if so, is this common knowledge in certain circles? I haven't heard it before.

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/22/2004; 4:04:08 AM (reads: 1766, responses: 0)
It seems like the argument about typing variables is easily solved: It makes sense in functional languages, but not in imperative languages.

I would rephrase this as: there is a typing of names and there is a typing of locations. Both may be beneficial or not depending on the whole design of the language.

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
5/22/2004; 4:09:04 AM (reads: 1760, responses: 1)
there is a typing of names and there is a typing of locations. Both may be beneficial or not depending on the whole design of the language.

This is a lot more vague. The separation between imperative and functional doesn't work then (in this case)?

PS. you are quick, I had rephrased my post in the meantime.

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/22/2004; 7:46:18 AM (reads: 1754, responses: 0)
It would be quite a lame reason

you've got broken email, i told you about it, you didn't fix it. i don't see how that makes me lame, frankly. [on edit - sorry that was unnecessarily snarky. been a bad couple of days]

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/22/2004; 8:00:58 AM (reads: 1752, responses: 0)
Andris: there is a typing of names and there is a typing of locations.
Sjoerd: The separation between imperative and functional doesn't work then (in this case)?

Well, what I meant is, (most of) both functional and imperative languages have both names and locations.

Variables in some languages are just names, which can potentially refer to the same location, in some others they are identified with location itself, while in still others you have a choice.

Like in C++, int& intName or int intLocation. I consider int* intPointerLocation to be location, not name.

When talking about typing variables, it may occasionally be worth remembering whether we are talking about name variables or location variables. In some languages name variables are immutable while location variables are mutable, which makes the static typing tradeoffs differ for them.

Patrick Logan - Re: The Case for First Class Messages  blueArrow
5/22/2004; 11:53:18 PM (reads: 1676, responses: 1)
It seems like the argument about typing variables is easily solved: It makes sense in functional languages, but not in imperative languages.

This approach seems to be in conflict with the observation that functional languages increasingly have been incorporating "imperative" capabilities.

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/23/2004; 6:37:22 AM (reads: 1679, responses: 0)
is it conflict? maybe we're still learning. perhaps the integration of typing with imperative languages is something that is becoming easier, but which has historically been poorly done because we didn't know how?

then one's position on this would depend largely on what kind of experience you had - people working in industry with large projects that involve interoperation with existing standards (how many typed, modern languages have corba integration or are tightly integrated with the win32 gui?) might well think this, while people working with the latest ideas and on projects that are largely stand-alone would think otherwise.

ocaml is getting there, and .net might move things along, but otherwise...?

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/23/2004; 6:16:23 PM (reads: 1643, responses: 1)
Ya, it was a pot stirring statement Andris! Thanks for letting me know that the clarification worked... that's big of you.

Yes we are certainly discussing many topics (your list is excellent).

Oh, a long time friend and Smalltalker told this to me today "I don't want to have to ignore a ton of junk (extra code, weird syntax extras, etc...). When working in C, C++, Objective-C, Java, etc... I have to ignore this junk." It causes cognitive dissonance and confuses people thus slowing them down.

Loved your link on dynamic typing and LSD. ;--) It definitely is related to what I've been going on about.

I found this article at Chimu, on the page you linked to. It is a good one: Why Dynamic Typing? Alan Kay wrote: "The short answer is dynamic typing can scale well because one tends to create much less code as the system grows bigger. As the system grows, objects will get reused in many different situations for which they work well, and the layers of "membranes" allow clients to not worry about internal details very much. It still requires a good architecture to build a big system, but the generalizable functionality of the objects is helping a lot with the design/implementation."

"Now someone could argue that static typing should have the same property: you write less code as you create more functionality. But the truth is that static typing does not just avoid/remove bad code, it REMOVES GOOD CODE too. And this good code that static typing is removing is exactly the code that makes Smalltalk so scalable: it is the code that can be reused in many different situations that were never planned for by the original authors. Dynamic typing excels because it allows this highly reusable, good code that would not pass a static type check."

Ok, so Alan supports my view, well actually the other way around since he was here first. ;--)

The questions that I really want to get to and discuss have to do with:

Why does dynamic typing work so well? How can it work? Alan provides some of the insight, but I'd like to see how much further our understanding can go if possible.

The Chimu article continues with a performance analysis: The following are all rough estimates between Java and Smalltalk, but they are based on years of experience doing very similar tasks in both languages [but Your Mileage May Vary]. Smalltalk requires about 1/2 to 1/3 the number of statements within a method to accomplish the same thing as Java [this is one of the most painful aspects of switching back and forth between Smalltalk and Java/C++], so the extra code is at least 100% (2x). The next level is the number of additional methods needed because of static typing. From my experience this is probably about 20%: one in six methods in Java would simply not need to exist in Smalltalk because they are solely solving a static typing problem and could otherwise be collapsed into the remaining five methods. The next level is additional classes, which is at least another 30%. Finally, I will end with additional "packages" of functionality which have to be rewritten or somehow significantly copied/changed to use in the desired context. Again, I would say this is about 20% or so (where the 'or so' can get really large). All totaled this is a minimum of:

2.0 x 1.2 x 1.3 x 1.2 = 3.75 (275% larger)

and could get as large as

3.0 x 1.3 x 1.5 x 1.7 = 9.9 (890% larger)

If you take out the method-statement-level multiplier (people don't seem to mind this growth as much and it is well localized) you get an 85% to 230% growth in overall system size (packages of classes of methods).

Ok, Chimu is supporting my statements with some metrics. He summarizes with: "A way to really think about the negative impacts of static typing would be to consider (as you walk around) how many things in the real world would be extremely difficult to do if static typing was enforced on them. For example, could you have a shoe-rack? (no, someone would have to 'cast' their shoes when they took them out again). Could you use a key to cut open a package? Could gas-injection cork removers exist? Heterogeneity, flexibility, extensibility, and reusability are all punished by static typing."

In languages with typed variables the code is, in a sense, also typed, since it applies only the "valid" operations for the type of the variable it operates on. In an untyped language like Smalltalk, where the variables are no longer typed (and the "type" information has moved to the classes), does this mean that the code operating on those variables is no longer typed? It seems that polymorphism starts to shine here, taking over and providing the leverage into generality and scalability that Alan is talking about. Generalized Smalltalk code is highly polymorphic.

Is there still "type" information of some kind in the "code" of a method that applies its operations to untyped variables?

Does the name of a method imply a "type"? What if that method is defined on more than one class or prototype object? Then we have polymorphism and multiple types of objects can be successfully run through the code in question.
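A Python sketch of that polymorphic situation (the class and function names are my own illustrations): `describe` is "typed" only by the message protocol it uses, so any object answering the `area` message can be run through the same code.

```python
class Square:
    def __init__(self, side):
        self.side = side
    def area(self):
        return self.side ** 2

class Circle:
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return 3.14159 * self.radius ** 2

# describe is constrained only by the protocol it uses: any object
# that answers the area message flows through the same code.
def describe(shape):
    return "area is %s" % shape.area()

print(describe(Square(3)))   # -> area is 9
print(describe(Circle(1)))   # -> area is 3.14159
```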

In Smalltalk, when an object receives a message it does not understand, an error is generated by the message-sending machinery in the virtual machine, which then sends the receiving object the #doesNotUnderstand: message, thus making the error visible. This error can be caught using exception handling techniques. During development this usually brings up a debugger (unless the error was specifically caught by your own or a library's exception handlers). In some cases it's desirable to "intercept" this error message by implementing #doesNotUnderstand: on the receiving object itself, where it can take "corrective" actions and "rewrite" the message. While this isn't used that often, it's quite powerful, especially with proxy objects and other kinds of "stand-in" objects.
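Python offers a rough analogue of this interception in `__getattr__`, which is invoked only when ordinary attribute lookup fails; a minimal proxy sketch (the class is my own illustration, not an actual Smalltalk mechanism):

```python
class LoggingProxy:
    """Forwards any message it doesn't itself understand to a target
    object, roughly as a Smalltalk proxy overriding #doesNotUnderstand:
    would, logging each intercepted message along the way."""
    def __init__(self, target):
        self._target = target
        self.log = []

    def __getattr__(self, name):
        # Invoked only when normal lookup fails, i.e. for messages
        # this proxy itself does not understand.
        self.log.append(name)
        return getattr(self._target, name)

p = LoggingProxy([3, 1, 2])
p.sort()        # the proxy doesn't understand 'sort'...
print(p.log)    # -> ['sort']  ...so it was intercepted and forwarded
```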

From my experience writing and re-writing Smalltalk (and in a few cases Objective-C) methods to be more general, it seems that care must be taken to ensure that the messages sent, their parameters and their sequence of sending are appropriate for the "polymorphic" message "protocols" being used. In a real way, programming methods "generically" involves stepping up the level of abstraction from the "specifics" of certain classes of objects to the "generics" of a "polymorphic" grouping of classes. Often this means moving any class-"specific" code out of the method and into those classes that need it, thus improving its "genericity level" or "generality level" (to coin a phrase). Sometimes it means pulling code from certain classes and "generalizing" it to work across all of the polymorphic group of classes involved. Much of the time the members of these polymorphic groups involve multiple class hierarchies at various depth levels. These techniques and others are all "refactoring" techniques that can be used to improve the "genericity" of objects. Many of them work in most languages, but some are specific to "untyped variable languages".

As to the question of whether or not "functional" languages "require" typed variables, I'm not certain. Are there any purely functional languages that use untyped variables? It seems possible to have an untyped functional language, and thus doable, but I'm not an expert on functional languages. Which is why I'm here asking these questions, hoping to gain a deeper understanding of how "types" and "typed variables" impact language design and implementation, the people using them, and the applications and systems built from them.

The Slate programming language, a variant/descendent of Smalltalk, has optional types and seems to, at first reading of their paper on Prototypes with Multiple Dispatch (PMD), use types in an innovative and effective manner. Maybe they are onto a compelling benefit and reason for types. I'll have to study their work more and try it out to find out. It certainly has some very interesting ideas.

One of the reasons that I've been discussing this topic is such depth here at Lambda the Ultimate is that I feel this community might help me with some of the choices in the design of the Zoku language variant of Smalltalk that I'm designing. I have the option of adding typed variable slots and I need to fully understand the consequences and benefits that these choices carry with them.

There seem to be two kinds of "typed variables": those checked statically at compile time and those checked dynamically at runtime.

I understand compile-time type checking quite well in terms of implementing it, and I also, as must be clear from these posts, understand the cost consequences of using static types. What I've been asking in these posts is: are there any seriously compelling benefits to statically checked, compile-time typed variables?

The main use for "typed variable slots" that are dynamically checked at run-time is "input validation" in code that interfaces with graphical user interfaces, other systems, other modules, etc. In this case the "typing" is really a set of "dynamic validation checks" carried out at run-time with no compile-time checking applied. This is highly successful in a number of large applications built in Smalltalk.

In a Smalltalk program if you wish to check the "type" of an "object" that is in a variable to ensure it's of a class type that understands the messages you are about to send it you have a number of run-time class/type checking validations that you can perform. You can check the class. You can send an "isX" message, such as "isNil" or "isNotNil", where "X" is the name of a test that all objects understand. You can test if the receiving object in the variable understands the protocol of the particular message or messages you are going to send. You can get more sophisticated and detailed by interrogating the "structure" (connections to other objects) and "values" of the object in question. The dynamic run-time test code is expressed "inline" with the rest of the method.
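Python near-equivalents of those run-time checks, applied to an object held in an untyped variable (the variable and value are my own example):

```python
# A few run-time class/type checks, analogous to the Smalltalk idioms
# described above.
value = "hello"

print(value is None)             # like isNil            -> False
print(type(value).__name__)      # ask the object its class -> str
print(hasattr(value, "upper"))   # understands #upper?   -> True
print(isinstance(value, str))    # class membership test -> True
```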

What I'm considering for Zoku is having this "dynamic type validation checking" code associated with the "variable slots" themselves rather than inline with the methods that use them. Usually in Smalltalk this is done in the "setter methods" of the receiving object. While this works fine for objects (class or prototype based), what about method temporary variables and method parameters? In Zoku you'd be able to factor this code out of the method body and specify it for method parameters and temporary variables within methods and block closures. In some of the simple cases this might on the surface look like statically typed variable syntax, but it's all implemented dynamically.
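A rough analogue of attaching the validation to the slot itself exists in Python properties (the Account class and its rules are my own illustration, not Zoku syntax): every write to the slot, from any method, passes through the check at run-time rather than the check being repeated inline in each method.

```python
class Account:
    """Illustrative only: the validation lives with the slot
    definition, not inline in every method that writes the slot."""
    def __init__(self, balance):
        self.balance = balance     # routed through the checked slot

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        # Dynamic validation attached to the slot itself; every write,
        # from any method, is checked here at run-time.
        if not isinstance(value, (int, float)):
            raise TypeError("balance must be a number")
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value

a = Account(100)
a.balance = 25                     # passes the slot's checks
try:
    a.balance = "lots"             # rejected at run-time
except TypeError as e:
    print(e)                       # -> balance must be a number
```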

In many ways this is similar to, and could be implemented as, "pre" and "post" conditions on a method (or a block closure), its parameters, or its temporary variables. The Eiffel language has a good implementation of these.
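Such pre/post conditions can themselves be expressed as ordinary dynamic run-time checks; here is a minimal Python decorator in that spirit (a toy illustration of the idea, not Eiffel's actual mechanism):

```python
def contract(pre=None, post=None):
    """Toy pre/post-condition decorator: the conditions are ordinary
    dynamic run-time checks wrapped around the function body."""
    def wrap(f):
        def checked(*args):
            if pre is not None:
                assert pre(*args), "precondition failed"
            result = f(*args)
            if post is not None:
                assert post(result), "postcondition failed"
            return result
        return checked
    return wrap

@contract(pre=lambda n: n >= 0, post=lambda r: r >= 1)
def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

print(factorial(5))   # -> 120; factorial(-1) would fail its precondition
```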

It's very important to note that the kinds of dynamic type and value testing that are available at run-time far exceed anything that's possible at compile time. (I'm wondering if this is always the case; are there compile time tests that can't be done at run-time that also have compelling benefits?)

So do any of you have any ideas, answers or hints of answers?

Daniel Yokomizo - Re: The Case for First Class Messages  blueArrow
5/23/2004; 6:49:10 PM (reads: 1634, responses: 0)
Hmm, three things here:

1. The Java/Smalltalk large difference in program size is largely from the fact that its type system is primitive (e.g. no parametric polymorphism, no type inference). With better type systems the difference in size would be much smaller. If we compare Haskell and Smalltalk there's little size difference.
2. As Frank mentioned it seems that you aren't aware of static type systems other than Java/C/C++ like, if so it would be enlightening to check some (e.g. Haskell, SML, O'Caml).
3. Try to break your posts into smaller ones, grouping your ideas in each one. Currently you're writing many different things in each post and your point sometimes gets mixed up. For example we can talk about bloat in code related to unnecessary typing information (i.e. things that a type inferencer couldn't find), what a type is, and what advantages we get from a Smalltalk-like system, but each topic is pretty much orthogonal.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/24/2004; 12:10:08 AM (reads: 1599, responses: 1)
In response to Daniel:

1. Exactly how would a better "type system" enable Java to have smaller methods, as Smalltalk does? Please be specific about what "better" means. Also, how can a type system that provides typed variables keep code "bloat" to a minimum?

2. I am aware of those type systems. What specific aspects of their "type" systems do you suggest have advantages that are worthwhile?

3. It's challenging enough to write posts that cover the points that I wish to make without being concerned about chopping them up. I try to be brief, but I like to respond to other posts with a cohesive post. Thanks for your suggestion.

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/24/2004; 1:25:03 AM (reads: 1598, responses: 0)
It's very important to note that the kinds of dynamic type and value testing that are available at run-time far exceed anything that's possible at compile time.

I think you would be interested in multi-staged programming ideas. IMHO, the whole idea of MSP is that there are things that are better done earlier (e.g., compile-time) and there are things that are better done later (e.g., run-time). But they are not limited to two stages (thus "multi").

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/24/2004; 6:04:59 AM (reads: 1626, responses: 1)
In my view the dynamic path to the future of first class messages is integrated with a first class object language.

In theory I like the idea of multi-staged programming, however in practice "static compile-time typed variables" constrain and limit the ability of programs much like a straitjacket does. While you can program that way, it is by its very definition a more limited means of programming. When you program with "typed variables" you are clearly placing "constraints" and "limitations" upon the variables.

In a way, constraints are a lot of what programming is about. Let's get serious though: let's not constrain our own movements in and about our software or our software designs so much that it, and we, can't breathe.

Mixing these ideas leads most uninformed programmers down the well-trodden path to brittle and rigidified typed-variable software.

I worked on a team building the Audition language, which was "multi-staged" in that it used typed and untyped variables, and early binding and late binding. Fortunately I didn't design the language but was just on the rescue team. It was a mess, and as we debugged it we generalized more and more cases of typed variables, by untyping them, thus enabling cases not thought of by the original team members. It was not just a technical battle but also a political battle within the teams, one that had great impact upon the technical outcome of the project. There would actually be heated battles over which "type" a variable or parameter should be. Give me a break! There are much more important architectural decisions at stake than getting caught in that level of minute detail.

Imagine arguing over one of ten shapes of rebar to tie a column in a building together when any one will work equally well to keep the building standing. What a waste. Choose your battles.

It's like the locked-in madness that the database people had at a very large Fortune 500 company that I worked at. The database person refused to take fifteen minutes and adjust the size of a database table column from four to six characters! Yes, she wasted hours of meetings fighting against the change even though the table was brand new and wasn't even populated with live data yet!

Premature lockdown of types or classes has the potential to harm projects. Plain and simple. Rigid human thinking creates brittle software. The vast majority of projects fail and I wonder how many fail as a result of the kind of rigid thinking imposed by typed variables and other brittleness spreading features of current languages.

What you think you know today about your design, when it has you making "variable constraint" decisions, limits the current and future use cases of the software. Most software needs to keep growing and adapting to its users' changing requirements, and building "rigid locked down" cases into the programs doesn't help any.

When you lock down your software too tightly you are saying that your users are not important and that their concerns for new use cases and flexibility are not important. You are saying that quality isn't as important as a rigid control of brittleness. Rather than removing brittleness and embracing flexibility you are saying to your users that you will constrain them and limit them, and most important of all you are saying that you control them. Not what they want to hear. Not what the people paying the bills want to hear.

I suppose that's what some projects are about. All the power to you.

This is a view gained by hard experience. I strive for flexibility in software design. Typed variables limit flexibility. How could they do the opposite of what they are designed to do? They can't!

Look, you can design and write your software any way you please. Choose typed or untyped variable based languages, or hybrid languages. Do the best you can.

Since it's possible for a language to produce stunningly successful results without using typed variables, you must concede that typed variables are unnecessary for programming. That's a powerful statement. No doubt they might still be useful in some languages, but typed variables are unnecessary for programming! The proof? Languages like Smalltalk, LISP, Self, Zoku and others.

If you object to this then you must have some compelling reasons that I really want to know since they must be really important and worth it. Please be specific, and as clear and as detailed as you can.

Note that I didn't deny that systems with statically typed variables work. Eight-bit computers work, yet they aren't used as mainstream processors anymore. Sure, they are used in specialized embedded chips all over the place (and sixteen-bit chips even more).

Choosing the capabilities and specific features that go into a language is crucial for many reasons: expressive ease and power, access, conceptual clarity, simplicity or complexity of syntax and grammar, etc. One could create a language with everything in it, but you'd end up with a binary Tower of Babel. Design matters.

One of the main reasons that The Cult of the Cryptic keeps supporting "typed variable languages" is that it's easier to interface and map the code to modern processors. Yet the capabilities of 32-bit and 64-bit chips now in, and soon to be in, consumers' hands provide us resources undreamt of years ago. Yet the vast majority of the computing industry clings to designs of the past that encourage rigid software designs. It's almost as if people are holding onto the older "iron metal" culture rather than adopting the lighter metals and plastics of modern products. Sure, iron is still useful, but fewer and fewer products use it. Seen a new cast-metal home or office phone lately? It's used where it's good: in products like fences, in frying pans (although fewer of those sell than before), etc.

In the vast majority of software products the end user is better off with the more advanced (and by definition) flexible software designs based upon dynamic run-time late binding and untyped-variable decision making. Yes, this technology has been around for thirty years or more as well, and it's improved tremendously since then (compare the first Smalltalk system to the latest Self or commercial Smalltalk system). It's not the age of the technology; pens are still powerful instruments, but most people use prefilled ink pens rather than an ink well. It's about adopting technologies that actually have a positive, measurable and desirable impact upon the software that is being built.

Look, I'm under no illusion that the vast majority will follow Java and C# to the ruin or marginal success of most projects. I'm under no illusion that the Cult of the Cryptic won't continue to exert its influence upon the vast unsuspecting masses of less aware programmers.

In my vision of the future of software, low-level decisions like variable typing will not be done at all. Most low-level decisions will be made by automated software. In the future the vast majority of programs will be written by programs from specifications (of one sort or another) or declarative "designs". Details will be filled in by programs asking questions of databases and, as time progresses, less and less frequently by humans. Of course, if this future is to come to pass, the underlying systems need the most flexibility and adaptability. Whatever the future holds, dynamic systems are gaining ground.

Even clunky systems like C# from Microsoft are adopting closures that Smalltalk and LISP have had, well, essentially forever, since the dawn of time of the industry. What's driving C# forward? S#! Yes, S#, a variant of Smalltalk! The creator of S# has worked closely with Microsoft to ensure that their environment (.NET) has a place for Smalltalk. While we might not all like the .NET universe that Microsoft is spawning, it will undoubtedly have a profound impact upon a vast horde of programmers who will blindly follow its linguistic dictates, chaining them into "statically typed variable" bondage. Eventually they might find relief with choices like S# and Sharp Smalltalk (yes, there are two of them now, with more on the way from other Smalltalk vendors - Cincom is porting one of its three Smalltalk versions, ObjectStudio, to .NET). Call it "convergence" upon a "standard vm". Ick.

It's about straitjackets vs. freedom. Choose. Express. Succeed.

The Zoku language is all about freedom. The freedom for software to be adaptive with ease. The freedom for users to break out of brittle shells created by programmers; the freedom for programmers and users to stretch their software to new limits and then beyond again, to breathe in new expressiveness that others have not thought of yet, to wax poetic in a literate language - abandon cryptic codes and cults. Learn to do more with less.

No software is perfect. Zoku is not perfect. The point for Zoku isn't perfection but striving to be the best possible system I and others can make it. Let's have a taste.

Zoku: The First Language With Inline Syntax for Messages as Objects

The original article driving this thread is about making messages first class objects. In Zoku messages are first class objects beyond the high standard that Smalltalk already sets for systems. In Zoku messages as objects can be accessed inline in the expressions of the language. In Zoku, as in Smalltalk, all statements are made up of messages being sent, and all meta data is accessed using message sends. Unlike Smalltalk, in Zoku you can access messages inline, without changing their normal expression, as objects before they are sent - and you do so with messages! This is total first class meta-level access to messages as objects, and to the message sending system as an object, on a full equal footing with all other objects, using the same message passing system and syntax that are used to write any other message sends to any object in the system. This is the recursive elegance and beauty of the Zoku language design. This design comes from the flexibility inherent in the timeless simplicity of the dynamic grammar. First class messages are taken to a whole new level, creating a new and powerful language with full expressive power as an object system and as a distributed messaging system.

(Note: As far as I know no other languages incorporate an inline syntax for allowing messages to be first class objects.)

Dynamic systems rule the future of first class messaging and object languages. Zoku leads the pack.

Andris Birkmanis - Re: The Case for First Class Messages  blueArrow
5/24/2004; 6:27:56 AM (reads: 1627, responses: 0)
In Zoku messages are first class objects beyond the high standard that Smalltalk already sets for systems

I was wrong, you don't need MSP, you are after Actors of Hewitt... Nice model, I like its conceptual simplicity (not unlike that of Scheme, which is not coincidental).

But I tend to agree with other people - your posts are difficult to read :-( Have you tried other media, like a Wiki? It would allow other people to comment more easily on your points.

Daniel Yokomizo - Re: The Case for First Class Messages  blueArrow
5/24/2004; 10:01:26 AM (reads: 1563, responses: 0)
1. Exactly how would a better "type system" enable Java to have smaller methods, as Smalltalk does? Please be specific about what "better" means. Also, how can a type system that provides typed variables keep code "bloat" to a minimum?

Consider the following Haskell code I wrote to show Haskell to someone:

import Data.Char (chr, ord)

allDigits = ['0' .. '9'] ++ ['A' .. 'Z']

readNum base = foldl plus 0
  where plus n = (+) (n * base) . digit
        digit = index allDigits 0
        index (x:xs) n c
          | c == x    = n
          | otherwise = index xs (n + 1) c

showNum base = map (allDigits !!) . getDigits base

getDigits base n = snd $ until ((== 0) . fst) next (n, [])
  where until p f x
          | p x       = x
          | otherwise = until p f (f x)
        next (n, ds) = (div n base, mod n base : ds)

splitWith n [] = []
splitWith n xs = y : splitWith n ys
  where (y, ys) = splitAt n xs

format width f xs
  | length xs >= width = take width xs
  | otherwise          = f (width - length xs) xs

left width x = format width (\n xs -> (take n $ repeat x) ++ xs)

readFromOctets = map (chr . readNum 2) . splitWith 8

writeToOctets = concat . map (left 8 '0' . showNum 2 . ord)

There are two things about Haskell's type system that make this code snippet much smaller than the Java equivalent:

1. Type inference: as you can see, there are no type declarations. Each declaration and expression has its type inferred by the compiler, and we don't need to write them down.
2. Parametric polymorphism: several higher-order functions in the code (e.g. map, foldl) substitute for explicit recursive functions, making some pieces of code one-liners. It's also very easy to define our own higher-order functions (e.g. format, until) and use function composition to reduce the number of explicit lambdas.

The equivalent Java code would be much larger (and is left as an exercise for the reader ;). The code could be greatly improved by using other standard higher-order functions, reducing size and increasing readability, but as it stands it illustrates how a better type system can reduce the "bloat" associated with static types for all expressions and declarations. In this example the number of type declarations is zero: all the types can be inferred, and we don't need to write anything "just to make the compiler happy".
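For readers who don't read Haskell, here is a rough transcription of readNum and showNum into Python. This is an illustrative sketch only, not part of the original post; the names read_num and show_num are my own:

```python
# ALL_DIGITS mirrors: allDigits = ['0' .. '9'] ++ ['A' .. 'Z']
ALL_DIGITS = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def read_num(base, digits):
    # Like foldl plus 0: accumulate n * base + value-of-digit.
    n = 0
    for c in digits:
        n = n * base + ALL_DIGITS.index(c)
    return n

def show_num(base, n):
    # Like getDigits: peel off digits with divmod, then map through ALL_DIGITS.
    ds = []
    while n != 0:
        n, r = divmod(n, base)
        ds.append(ALL_DIGITS[r])
    return ''.join(reversed(ds))

print(read_num(16, 'FF'))   # 255
print(show_num(2, 10))      # 1010
```

Note that the Python version spells out the recursion that the Haskell higher-order functions (foldl, until) abstract away, which is exactly Daniel's point about parametric polymorphism.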

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
5/24/2004; 11:17:18 AM (reads: 1543, responses: 2)
Peter, I apologize for the disparaging tone of my previous post, but not its content.

There is a difference between calling someone stupid and calling them ignorant (ill-informed). The first is an ad hominem attack, but the second is not. I admit to ignorance about many things, but not stupidity. If someone were to say that I am ignorant about a topic I take a deep interest in, then perhaps I would regard it as derogatory. But since you don't seem interested in static typing, I don't see why you should take it badly for me to point out your ignorance on the subject.

You do seem ignorant about typed languages; if you were not, you would have immediately anticipated the mention of ML-like languages and type inference. In addition, you've made many other remarks which strongly suggest that your experience with static typing is largely limited to the C family and its OO offshoots, none of which reflect the state of the art, or even the state of the art that existed at the time of their inception.

In addition, your repeated use of the phrase `typed variables' betrays some kind of tunnel vision; I'm not quite sure what is going on in your mind when you use these words, but I'm pretty sure it isn't right.

You talk about me `getting personal', but in fact I'm only criticizing what you have written. I'm not out to embarrass you, but I do disagree with what (I think) you say, so I want you to say something which is either objectively verifiable or irrefutable. Instead, you've been enumerating anecdotes, dogma and fuzzy analogies, and making fantastic unsupported claims like:

Peter: Smalltalk provides a bare minimum of capabilities needed to cross the "Wolfram Computational Equilivance threshold"

Indeed, it would be a small miracle if something as specific as Smalltalk somehow ended up being connected in a canonical way to something as abstract as Wolfram's research.

You wrote,

My point is that I've not seen any compelling arguments for the addition of typed variables to Smalltalk.

Well, I'm sorry to have to say this, but whether or not you've seen any compelling arguments is irrelevant. What matters is whether the literature exhibits any compelling arguments, and preferably ones which are not subjective but objective. (And why should I care about Smalltalk in particular?)

In addition, Peter, I have to say that I find the way you write about programming languages to be vague, imprecise and sometimes unintelligible. To me, this suggests that you are ignorant of the accepted (or `recieved', if you will :) terminology because you are ignorant of the literature. I think it also significantly contributes to muddled thinking on your part.

You will accuse me again of attacking you for saying so, so let me give an example:

Types do seem to make some sense in functional languages especially when you don't have objects to hold the type information. In no way am I saying that these systems are not "consistent" or don't provide benefits, they do work and provide value.

What does it mean to say `types make sense' in a functional language? How do objects `hold type information'? (How are objects even relevant?) And the reason you quote the word `consistent' is surely because you don't understand what it means.

You also wrote,

If you took my statements to be "unequivocally" and "categorically absolute" statements about programming languages that is not the context or meaning that I intended.

Well, that is precisely the problem, Peter. Programming language theory is all about making statements as unequivocal as possible, and supporting them with evidence that is as objective as possible, preferably a mathematical proof.

The statements I've been making are in the context of languages such as Smalltalk, C++, Java, C#, etc. I've not been intending to make comments about functional languages.

First, as I recall, Smalltalk is a functional language.

Second, why do you think it is easier to make unequivocal statements about functional languages? And, don't you think the ability to make such statements would be a desirable property of a programming language, seeing as how it would allow you to make unequivocal statements about programs written in that language?

Third, aren't you conflating `functional' with `typed'?

Look, if there is a compelling benefit to adding typed variables to a language like Smalltalk I'd gladly accept it to gain that benefit. What is the benefit?

(Again, why does everything turn into `typed variables' and `Smalltalk' with you?)

The benefit of types in a typed language is that it gives the programmer a tool he can exploit to write programs that are easier to reason about without reducing their flexibility. I know you argue that typing reduces flexibility; I'll show why this is incorrect in my article. If you will allow me a subjective argument, I would also say that typing gives a much-needed framework that allows programmers to think more carefully about their programs.

I'm not attempting to convert you.

But you should be! Because IMO the choice of static or dynamic typing is not a question of taste; static typing is provably more expressive. I've already shown by an embedding that static typing is at least as expressive as dynamic; in my article I plan to show why it's strictly more expressive.

You wrote: Typed code tends to be less generic, thus less reusable and as a result more code is required to express the same ideas.

I asked: 1. Less generic/reusable than what?

Type variable code is less generic/reusable than untyped variable code, of course.

(What is this obsession with variables?)

This does not answer my question; it only restates your original claim.

If we have a piece of code X in typed language L, you claim that X is less generic/reusable. Less generic than what? Describe the language L', and piece of code X' in language L', that you claim X is less generic than.

For example, you might answer: "Take L as the simply typed lambda-calculus and L' as the untyped lambda-calculus. Then a lambda-term X of L is `less generic' than X', where X' is obtained from X by erasing all the types in the term (i.e., X' is the `type erasure' of X). I claim X is less generic than X' because so-and-so...."

Yes, I'm comparing typed variable code with untyped variable code. They are comparable.

I know typed and untyped languages are comparable. I'm asking: what is the comparison that you are using?

[re: integers and reals]No that sense isn't what I've been talking about. The interpretation that you are suggesting has absolutely nothing to do with any of my arguments.

I think it does have something to do with it, which I prefer to reveal later. Please answer the question.

It's clear that in a language like Smalltalk the "type" information of an integer or a real number is shifted from the "type" of the variable to the object contained within the variable.

I'm not interested in Smalltalk, or Smalltalk integers or floats. My question concerns the set of real numbers and the set of integers; these are abstract mathematical entities which exist independently of Smalltalk.

However, since you bring it up:

In the case of Smalltalk an instance of an integer or a real (a "Float") "carries" its own type information with its class. The type information isn't lost; it's simply shifted from the variable to the object contained in the variable.

My claim will be that information is necessarily lost, because you cannot recover the type of the variable from the type of all the values it may hold, that is, your `shifting' operation is not an isomorphism.
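The non-isomorphism point can be sketched concretely. The following Python snippet is purely an illustration (any language with optional type annotations would do): two variables with different declared types can hold values with identical runtime class tags, so the declared type cannot be recovered from the tags alone.

```python
from typing import Union

x: int = 3                # declared type: int
y: Union[int, str] = 3    # declared type: int-or-str

# At runtime, both contents carry the same class tag:
assert type(x) is int and type(y) is int

# The map from "declared type of the variable" to "runtime tags of the
# values it holds" is therefore not injective: from the tag (int, here)
# we cannot tell whether the variable was declared int or Union[int, str].
# That is the sense in which the "shift" of type information from
# variables to objects loses information and is not an isomorphism.
```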

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/24/2004; 12:13:18 PM (reads: 1540, responses: 0)
I've spent many years working with statically typed and multi-staged programming systems. I'm not the ill-informed programmer that some once thought. I think outside the box. I'm sure that we can learn to bridge the gaps in our communication and ideas.

If you find my posts difficult to read, please ask questions about them rather than stating your judgements about them and me without asking any questions!

Yes, I'm after conceptual simplicity and freedom from the unnecessary constraints that programmers all too often apply to software. I want to be able to adapt software ten to a hundred times faster and more reliably than we do today. Brittle technologies of the past or future are not going to get us there; in my view, flexibility and dynamic systems will. It will take innovation and potent new object models, with room for powerful languages with fully distributed first class messages and objects.

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/24/2004; 2:30:17 PM (reads: 1505, responses: 0)
frank, couldn't you *try* to understand what he's saying? i read your deleting type info stuff and - assuming i understood it! - it seemed like that was what he was trying to say, and what i understood by what he wrote and what you, too, understood by what he wrote. so why can't you write "i'd frame your argument like this ... and my answer then would be ...".

that way, you get to show you are not only right, but also just. and we get free lessons.

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
5/24/2004; 2:47:05 PM (reads: 1493, responses: 0)
My claim will be that information is necessarily lost, because you cannot recover the type of the variable from the type of all the values it may hold, that is, your `shifting' operation is not an isomorphism.

Interesting. I'd like to see proof of that!

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/24/2004; 6:10:34 PM (reads: 1472, responses: 0)
In response to Frank: I accept your apology, and I thank you for taking the time to write in detail. I apologize in advance for the length of this post and other posts. I feel that in order to give proper treatment and fully express my thoughts I need the space. If I could write with fewer words I would; however, time doesn't permit that.

I have a high standard of doing my best not to say how someone is "being" by using words like "you are x". To me these words are unnecessarily personal attacks, especially in a written forum that could be around for many years. They shift the discussion into a more aggressive tone.

Attacking the Person(argumentum ad hominem):
The person presenting an argument is attacked instead of the argument itself. This takes many forms. For example, the person's character, nationality or religion may be attacked. Alternatively, it may be pointed out that a person stands to gain from a favorable outcome. Or, finally, a person may be attacked by association, or by the company he keeps.

There are three major forms of Attacking the Person:
(1) ad hominem (abusive): instead of attacking an assertion, the argument attacks the person who made the assertion.
(2) ad hominem (circumstantial): instead of attacking an assertion the author points to the relationship between the person making the assertion and the person's circumstances.
(3) ad hominem (tu quoque): this form of attack on the person notes that a person does not practice what he preaches.

When you say "You do seem ignorant about typed languages" you are indicating something about my state of being rather than simply discussing my statements. That is a personal attack. I've attempted to avoid doing so in my writings to you and others in this comment forum. You don't see me doing assessments about your skills, capabilities, state of being, etc... in these posts.

It's a form of respect, and I wish to raise the bar of this conversation. It forces one to write on topic and not about the other person, of whom I have NO personal knowledge beyond a few written words in a comment forum. I'm not the topic here, nor are you; the topic is First Class Messages as Objects and Typed Variables. Some keep bringing it back to "types" in general, but that isn't the main theme of what I'm writing about, which is why I've not addressed it. I consider those kinds of statements personal attacks for these reasons, and request that you and others rise to the occasion and stop the ad hominem based statements. By the way, it was another person's assessment that your comments were "ad hominem" based statements.

I'm committed to writing on topic. What are you committed to?

Untyped variables are at the core of why Smalltalk gets its power. Obviously if you have knowledge that I don't, I'd love to learn it. If being focused on a topic is "tunnel vision" then so be it. The reason that your "tunnel vision" comment is again a personal attack is that you fail to ask me what's going on in my mind; you leave it as a statement of fact or truth, when it's just your opinion. This form of comment does more than just criticize what I've written; it includes your opinion about me and my state of being and mind, of which you don't have any substantial knowledge.

Indeed, it would be a small miracle if something as specific as Smalltalk somehow ended up being connected in a canonical way to something as abstract as Wolfram's research.
No need for a miracle; simple observation does the job. It's an attempt at an analogy or metaphor, and it's a rough draft. I admitted already that it wasn't clear and that I would address that in the next version of the article. It links the ideas of Wolfram into computer languages where they do apply. Is it so far-fetched that a generative grammar could be considered related to Wolfram's work on cellular automata? He has simple ones in his book "A New Kind of Science". I simply saw a connection and have begun writing about it. I'll transform your statement into a question and attempt to clarify it further in a subsequent article on that topic. For now, please see where Wolfram says "The multiway systems that I discuss are similar to so-called generative grammars in the theory of formal languages." and "Formal Languages and Multiway systems".

What matters is whether the literature exhibits any compelling arguments, and preferably ones which are not subjective but objective.
This is why I'm here asking questions. Where are the literature references?

I have to say that I find the way you write about programming languages to be vague, imprecise and sometimes unintelligible
Then ask questions or provide relevant references as requested. Writing about "vague" ideas of "style" and how they affect programming languages and the writing of programs isn't easy. However, one aspect of this, "untyped variables", seems very straightforward to think and reason about.

To me, this suggests that you are ignorant of the accepted (or `received, if you will :) terminology because you are ignorant of the literature. I think it also significantly contributes to muddled thinking on your part.
Again getting personal with "because you are" and "contributes to muddled thinking on your part". Writing without getting personal is a difficult style to embrace, however it's worth it. I'm learning how important it is in public writing.

There is a LOT of literature out there. Computer science is a vast field. No one person can know it all. I most certainly wouldn't presume to. Please define the terminology that you are using or wish me to use so that everyone is on the same page. If you have questions about my terminology please ask.

What does it mean to say `types make sense' in a functional language? How do objects `hold type information'? (How are objects even relevant?) And the reason you quote the word `consistent' is surely because you don't understand what it means.
Another personal attack, with "surely because you don't understand what it means". These kinds of statements seem designed to "bait". I'm not "trolling". I'm here for serious discussions; are you? I ask you to raise your standard by refraining from making those kinds of personal statements. "Cut the chatter, ... steady on target Red Five."

If the "untyped variables" in a language don't hold the type information then where is it held? In Smalltalk it's becomes part of the class information of the object contained within the variable. Where else can it go? Onto the variable or it travels with the variable's contents. The other obvious place it could go would be the "code" that the compiler generates about the variable and it's contents. Where else can the information go?

Objects are relevant since they are one of the main places where type information goes. The reason I say that "types make sense" in functional languages is that types make sense in any non-object language that doesn't have an "object" to put the type information onto.
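The idea that type information travels with the object rather than the variable can be seen directly in any dynamically typed language. Here is a minimal Python illustration (a sketch, not Smalltalk itself): the name carries no type, while each object bound to it reports its own class.

```python
# The name x carries no declared type; the object it holds carries a class.
x = 42
print(type(x).__name__)   # int

x = "hello"               # same name, new contents, new class tag
print(type(x).__name__)   # str
```

This mirrors Smalltalk, where sending #class to the contents of a variable answers the type information that a statically typed language would have attached to the variable itself.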

Frank wrote first: "I'm sure you are a fine and accomplished programmer, but that alone does not equip you with the tools one needs to reason unequivocally about programming languages."
Peter wrote in response:
If you took my statements to be "unequivocally" and "categorically absolute" statements about programming languages that is not the context or meaning that I intended.

Frank wrote in response: Well, that is precisely the problem, Peter. Programming language theory is all about making statements as unequivocal as possible, and supporting them with evidence that is as objective as possible, preferably a mathematical proof.

I am here to learn what I can. While mathematical proofs are interesting in theory, that isn't my purpose or intent, and I stated so. I'll leave that to others who have taken it on as their goal. Besides, this is a "comment" forum, not a PhD thesis.

I take exception to your statement that I'm not equipped to make "unequivocal" statements about programming languages.

I'm here because I am interested in exploring the objective and evidence based aspects of language design that are related to the language that I'm developing. If you feel that I should have knowledge that would be useful then please present those aspects to me without resorting to personal attacks.

If you feel that there is evidence that I'm missing please present it. I'm open to learning. My main questions are open to answers, they've not yet been answered. I'm open to contributions to my knowledge, skills, and experience that would positively impact the design of Zoku.

Frank wrote:First, as I recall, Smalltalk is a functional language.
Mainstream Influences of Functional Languages has this to say about it: Because the functional ideas inherent in Lisp were not fully developed at the time Smalltalk was created, the conceptual emphasis in Smalltalk was on object-orientation, derived from SimulaLanguage. If Smalltalk had been able to draw from Scheme instead of Lisp, there's a strong chance that it would have had a more functional bent, which might have affected the languages which were influenced by Smalltalk.
Smalltalk is generally considered an "object oriented language". While it might have some functional characteristics, it seems to be the second or third object language and the first "pure" object language in its category. In almost all the computer science textbooks that I have or have seen on the topic of computer languages, Smalltalk is considered object oriented and isn't classified as functional.

Second, why do you think it is easier to make unequivocal statements about functional languages? And, don't you think the ability to make such statements would be a desirable property of a programming language, seeing as how it would allow you to make unequivocal statements about programs written in that language?
I am not intending to make unequivocal statements. I'm simply making observations derived from my extensive experience studying and writing software for twenty five years.

Third, aren't you conflating `functional' with `typed'?
If you interpreted it that way that isn't what I intended. I most certainly am not "conflating" them.

Peter wrote: Look, if there is a compelling benefit to adding typed variables to a language like Smalltalk I'd gladly accept it to gain that benefit. What is the benefit?
Frank wrote: (Again, why does everything turn into `typed variables' and `Smalltalk' with you?)
Because that is the point I'm focused on. In my view it's the most relevant aspect to productivity and the kind of language that I'm creating. I'm not attempting to make any statements about any other aspects of "types" other than how they are transferred to classes. I'm interested in other opinions about how these aspects occur, and in anything else that might be relevant, such as how "polymorphism" takes on its role in methods in untyped variable based languages. Any thoughts, anyone?

Frank wrote: The benefit of types in a typed language is that it gives the programmer a tool he can exploit to write programs that are easier to reason about without reducing their flexibility. I know you argue that typing reduces flexibility; I'll show why this is incorrect in my article. If you will allow me a subjective argument, I would also say that typing gives a much-needed framework that allows programmers to think more carefully about their programs.
I await your article. Please send me a link when you've completed it. peter@smalltalk.org.

How is it possible that a computer language can have untyped variables and still produce incredible results? Since this is not just a possibility but a reality, there is room in the spectrum of languages for a class of languages that have their types on the objects rather than the variables. Any theory of types must explain the existence of Smalltalk, don't you think?

Untyped variables make a significant difference in a language. It is this difference that I've been writing and asking about.

Peter wrote: I'm not attempting to convert you.
Frank wrote: But you should be! Because IMO the choice of static or dynamic typing is not a question of taste; static typing is provably more expressive. I've already shown by an embedding that static typing is at least as expressive as dynamic; in my article I plan to show why it's strictly more expressive.
I look forward to your article. Given your statements above, how can a language like Smalltalk that has no typed variables - but has type info embedded in its classes - be as expressive as any language that utilizes typed variables? Can it be made simpler and stay expressive? Yes, a little! As I found out with Zoku.

Thank you for sharing your opinion of what you think I should be doing. It's simply not my purpose to convert you.

Contrary to popular myth there isn't one single point of view that sees objective reality.

Remember the story of the six blind men touching an elephant? They all report a very different point of view or experience depending on where they are touching: a leg, the tail, the trunk, the tusks, the ears, the body. Yet they are all reporting objectively about reality. Taking this further: a sighted person standing a short distance away would see the entire elephant. A biologist might take a DNA sample from many parts of the animal and see a genome. A poacher will see cash. A cook might see a dinner large enough to feed a tribe. A farmer might see an easy way to plow his fields and clear trails through the forest. A circus performer sees a partner for tricks.

My energy is going into creating a language, Zoku, which is a variant of Smalltalk, and doing related research in that regard. I've been making some important distinctions and observations from this thread of discussion that are relevant to my work in creating Zoku.

Peter wrote: Type variable code is less generic/reusable than untyped variable code, of course.
Frank wrote: (What is this obsession with variables?) This does not answer my question; it only restates your original claim.
I'm focused on variables since typed variables are usually used to create "static compile time" code lock down that limits the "expressiveness" of the code in question. Adding types to Smalltalk creates unnecessary baggage in the name of "type safety". Not having to "type variables" actually changes the nature of the code that is written. It becomes more general and reusable. See the Alan Kay quote in my recent post.

In some notorious languages the contents of an array must be of a certain type. In Smalltalk you can place any "type" or "instance of a class" into an array (or collection). The type requirements that "limit" what "type" goes into the collection, in the typed array, limit the generality of the code being written. That's what it's designed to do! It's designed to protect the array from the wrong type going into it. And it works. No argument there. You just don't need it in practice!
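The contrast can be made concrete with a heterogeneous collection, sketched here in Python purely as an illustration of the Smalltalk-style behaviour: no element type is declared, and each element carries its own class.

```python
# A heterogeneous collection, as in a Smalltalk Array or OrderedCollection:
# nothing constrains what "type" may go in, and each element knows its class.
mixed = [1, 'two', 3.0, (1, 2)]
print([type(e).__name__ for e in mixed])   # ['int', 'str', 'float', 'tuple']
```

A Java-style `int[]` would reject three of these four elements at compile time; that is exactly the protection (and the limit on generality) being discussed.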

The problem is that programmers think their code needs to be "statically" locked down by declaring the "type" of valid objects. Granted, sometimes you know that you'll only ever have an integer, but more often than not - in my experience writing programs for twenty-five years - I come back wanting to use code for some other purpose, only to find it has been overly constrained by the types chosen from an earlier view of the system. Systems grow and evolve, breaking the types in a system. Requirements and use cases change, often in unpredictable ways. While it's possible to represent these changes after the fact, can you really predict all the features and capabilities that will be added to your programs in the future? More often than not our plans for future features change.

Mathematical statements while interesting and useful are not required to understand the statements that I've made on this topic.

Peter wrote: Yes, I'm comparing typed variable code with untyped variable code. They are comparable.
Frank wrote: I know typed and untyped languages are comparable. I'm asking: what is the comparison that you are using?
Ah, good question. Actually I'm comparing typed with untyped variables, and then the languages that use them.

The comparison is simple. Let me explain it a couple of ways.

If I type a variable such that it can only take integers, this, by design, limits it to integers. Any attempt to put in another kind of object should either result in a type conversion or an error. By design it is more limited! Later on, the object's method is invoked in a use case that passes in a "real" number, but the code converts it to an integer, creating a problem. The original design only anticipated an integer and thus only works with an integer. An untyped version of this method would not care whether it's an integer or a real; if it did care, it could convert on the fly, dynamically. I've come across this kind of scenario with integers and reals in actual programming experience. The original programmer "hard coded" the type where the code would have been more general, and thus more reusable, with the type left "generic", i.e. any number would do fine. Usually the types involved are "domain objects" rather than the simpler kinds of value objects.
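The integer/real scenario above can be sketched in a few lines. This is an illustrative example of mine (the names halve_int and halve are invented), not code from the thread:

```python
from fractions import Fraction

def halve_int(n: int) -> int:
    # Over-constrained: committed to integers, so a real gets truncated.
    return n // 2

def halve(n):
    # Untyped: works for any object that understands division
    # (int, float, Fraction, ...), with no loss of information.
    return n / 2

print(halve_int(5))           # 2   (fractional part silently lost)
print(halve(5.0))             # 2.5
print(halve(Fraction(1, 2)))  # 1/4
```

The typed version enforces its original assumption; the untyped version stays generic across every kind of number, which is the reuse being argued for.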

I think that you'd agree that "hard coding" constants inline isn't good programming style. You can do it but it's not considered good style. It's better style to put "constants" in an appropriate place where they are more likely to be seen and thus easier to change. Better yet, move them out of the program and into a configuration file or preferences panel, or eliminate them altogether if possible by going to a dynamic system.

In much the same way "hard coding" variables with a "type" locks programs down in ways that many consider to be bad style. This is especially true of languages that support "static types" on variables.

A fun example: I could wear a straitjacket to prevent my arms from swinging around and hitting things, or I can simply manage my arms myself. I prefer to dynamically manage types rather than be confined by a "compiler" which believes it knows best.

Look, if a system under the covers "infers" the types of variables, that's a cool implementation; it can benefit the performance of the language and offer the programmer other features. For example, the Self language caches about five or so of the types coming into a method to "customize" different versions of the method, specific to the "type" - or really the "prototype" - of the object coming in. It will, for example, compile separate integer, real, or fractional (as in the 1/2 kind of fraction) versions of a method to maximize performance for the user. It does this without forcing the programmer to be specific about the types of the variable slots. This technology helps Self reach up to 50% of the speed of C. Not bad, considering all the features the language provides without using "typed variables". This is a useful and powerful technique and I've got nothing but respect for it.

For an attempt at objective "code size" metrics, please see the Chimu article quoted in my earlier post, with specific metrics comparing Smalltalk with Java.

Some of the comparisons are subjective and thus can't be encoded in mathematics in an objective manner. How do you really objectify "more general than"? Much of what we are discussing is a matter of style which I, and many others, assert matters a lot in programming languages and their design.

Regarding integers and reals Peter wrote: No that sense isn't what I've been talking about. The interpretation that you are suggesting has absolutely nothing to do with any of my arguments.
Frank writes: I think it does have something to do with it, which I prefer to reveal later. Please answer the question.
Ah, ok, so you don't want to answer my question and then you "point blank" ask me to answer yours? Hmmm... where are you taking your argument? I said that "types" matter in Smalltalk in the sense that the classification hierarchy encodes "type" information by distinguishing between types: for example, the various kinds of "Magnitudes" such as Dates, Times and Numbers. Numbers are further distinguished into various subclasses (or types) of Integer, in Small and LargeInteger varieties, and of course Floats (for reals). Less common in languages is the distinction of Fractions, which enables fractional numbers to stay fractions; i.e. 1/4 remains "1/4" unless "converted" to another "class or type" of number, as in "(1/4) asFloat", which converts it to "0.25" as a Float object instance.

I'm not interested in Smalltalk, or Smalltalk integers or floats.
Ok. What are you interested in? Are you at all interested in the fact that languages can do quite well without "statically typed variables"?

My question concerns the set of real numbers and the set of integers; these are abstract mathematical entities which exist independently of Smalltalk.
I believe I answered your question concerning this. Let me reiterate.

Frank wrote: "... Do you believe this is a tidier approach to mathematics and, based on this, would you recommend we abandon the reals and work exclusively with the integers?"
No, that would be silly since it is a needless understatement to say that the distinction between "integers" and "reals" has been and will continue to be useful in mathematics and programming languages.

Rather than asking me a question in the form you did, please simply make your statements regarding "integers" and "reals". Let's keep an open agenda, pick up the pace, and move the discussion along with some velocity.

Peter wrote: In the case of Smalltalk an instance of an integer or a real (a "Float") "carries" its own type information with its class. The type information isn't lost, it's simply shifted from the variable to the object contained in the variable.
Frank wrote: My claim will be that information is necessarily lost, because you cannot recover the type of the variable from the type of all the values it may hold, that is, your `shifting' operation is not an isomorphism.
Ok, let's look at the definition of isomorphism (using Google) to see if we are on the same page. I take it that you mean that "isomorphism is a one-to-one correspondence between two objects which preserves all their mathematical structure" or "A one-to-one correspondence between a perceived object and its internal representation"? Wolfram expands on the definition as does Wikipedia.

Ok, you are asserting that information is lost when a language like Smalltalk avoids typed variables? This may be the case; however, sufficient information is transferred to the objects, and distinguished by their classes, to make the language the first pure object oriented language and one that has stood the test of time.

I'm very interested in what information is lost. Please expand upon that further in detail if you would. If there is any value to that information then there might be a case for "typed variables" - assuming typed variables are required for it to be present. If you reread my posts you'll see that this is one of the areas where I'm looking for "compelling benefits".

As touched on earlier, a "type inference" or "type observation" capability in the virtual machine or run-time of a language, one that "records" the types of all objects placed in its (untyped) "variable" slots, is able to "determine" the set of all current and past "types" during the monitoring period. It seems that you can, for "all practical purposes", recover the type information if it's needed for some reason. It's just that you won't know the "type" that the programmer might have wanted to "lock it down to", since that has not been specified.
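A minimal sketch of such a type-observation facility, in Python rather than a Smalltalk VM (the class and method names here are invented for illustration):

```python
class ObservedSlot:
    """An untyped variable slot that records the class of every value
    ever stored in it, so the set of observed "types" can be recovered
    later even though the slot itself was never declared with one."""

    def __init__(self):
        self.value = None
        self.seen = set()   # names of all classes observed so far

    def put(self, value):
        self.seen.add(type(value).__name__)
        self.value = value

slot = ObservedSlot()
slot.put(42)
slot.put(3.14)
slot.put(42)
# slot.seen is now {'int', 'float'}: the current and past "types",
# recovered by observation rather than declaration.
```

As the post notes, what this cannot recover is the type the programmer might have *intended* to lock the slot down to; observation yields only what actually flowed through.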

Frank, once again I thank you for your comments that were "appropriately non-personal" and on topic. As for your other comments directed at me personally, I don't appreciate them and would hope that you take up the challenge of a higher standard of writing in public. I await your "article" with interest.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/24/2004; 8:15:29 PM (reads: 1453, responses: 4)
Hold your breath: I actually wrote a short posting!

Daniel wrote: Consider the following Haskell code I wrote to show Haskell to someone.

This is an excellent example and is very interesting indeed. I take it by "type inference" you are referring to the system figuring out the types and applying those types to the "untyped variables" thus making them "typed"! Does this happen at run-time or compile-time? If at compile time do the types become "static"?

The parametric polymorphism is also interesting. Again, are the variables then "co-typed" with two or more "possible types", or is the type information transferred to objects contained within the variables? If "co-typed", that would imply "static compile time" type inference and appropriate code generation.

I can see that the Java code would be larger.

Haskell definitely has a lot to offer in terms of "losing the requirement" to manually supply the "type information" for variables. How about dynamic objects? I read that Haskell supports objects; how do they fit into things and into my comments?

Dave Herman - Re: The Case for First Class Messages  blueArrow
5/25/2004; 12:36:08 AM (reads: 1414, responses: 0)

[In the absence of a FAQ about static/dynamic type wars...]

We users of untyped languages such as Scheme, Smalltalk, Python, etc., which lack (non-trivial) type systems and in which values carry runtime tags to distinguish them, often blur the distinction between types and tags. In a community of programming language theorists/enthusiasts, it's important to understand the accepted definition of core terminology. Types are essentially static assertions about the possible dynamic behavior of program terms. Note that runtime tags do not fit this definition at all!
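The tag/type distinction can be made concrete in a language with runtime tags (an illustrative sketch in Python; the helper name `describe` is invented):

```python
# A runtime tag is data attached to a value and inspected while the
# program runs; it tells you what the value *is right now*.
def describe(x):
    return type(x).__name__     # reading the tag, not checking a type

print(describe(3))        # prints "int"
print(describe("three"))  # prints "str"

# A static type, by contrast, is an assertion checked before the
# program runs. Python's annotations borrow the notation but not the
# guarantee: this lie is never caught by the interpreter.
n: int = "not an int"     # runs without complaint
```

The annotation on `n` looks like a type but asserts nothing at runtime, which is exactly the gap between tags and types the post is pointing at.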

Now, those familiar with C-like languages are accustomed to types being something associated with variables. However, they are much more than that. Let me attempt a quick introduction.

The definition of a PL generally begins with an inductive definition of program terms, i.e., a BNF grammar for the abstract syntax of the language. From here, we define a type system for a language by a set of logical rules that describe the possible ways to associate types with all terms in a language, i.e., for every possible piece of abstract syntax. This is most intuitive when it comes to value-expressions, such as:

(x == y) ? 17 : 42

In that example, we expect the if-subclause to be of boolean type, the subsequent then- and else-subclauses to be of equivalent types (in this case, int), and the entire expression to have the type of the latter two clauses (again, int). But we can give types to imperative terms as well. For example:

{
  int x;
  x = f();
}

We use the void type for imperative terms that don't evaluate to useful values. We can then type the statement x = f(); by matching the type of x, which is known statically, to the type of f(), which is also known statically, and assigning the entire term type void. For a sequence of statements, we use the type of the last statement as the type of the entire sequence. So in the above example, every subterm has a type, even though it appears on the surface like the only type involved is the one associated with the declaration of variable x.

Now if you thought writing down types for variables was bad, try writing down explicitly the type of every single subterm of an entire program! Well, to start with, some type annotations are clearly redundant; for example, once you've declared the type of variable x, it's simple for a type checker to determine the type of subsequent references to x. But it turns out that for extremely powerful type systems (far more powerful than in C/C++/Java/C#/etc.), there are algorithms for figuring out the type of every subterm of a program without the programmer writing a single explicit type annotation. (This is what Daniel is demonstrating in his Haskell example.) For more about this, look up the Hindley-Milner type inference algorithm, pioneered by the language ML.

This is an excellent example and is very interesting indeed. I take it by "type inference" you are referring to the system figuring out the types and applying those types to the "untyped variables" thus making them "typed"! Does this happen at run-time or compile-time? If at compile time do the types become "static"?

Everything we're talking about here happens statically. Types are assertions (theorems) about programs that are made and checked statically.

The parametric polymorphism is also interesting. Again are the variables then "co-typed" with two or more "possible types" or is the type information transferred to objects contained within the variables? If "co-typed" that would imply "static compile time" type inference and appropriate code generation.

Parametric polymorphism allows you to quantify over all possible types for a particular procedure argument, not just a couple possible types. Note that it is not coherent to say that type information is transferred to runtime objects; again, types are computed statically.

I won't take the space to explain parametric polymorphism, but suffice it to say that modern type systems (ML and family, Haskell, etc.) would be useless without it. (And Java is becoming much more tolerable with the addition of parametric polymorphism.)
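As an illustrative aside (not from the original post), Python's optional annotations can mimic the notation of parametric polymorphism, with an external checker such as mypy playing the static role:

```python
from typing import TypeVar, Sequence

# T is quantified over *all* element types, not a fixed menu of
# alternatives; that is what makes the polymorphism parametric.
T = TypeVar("T")

def first(xs: Sequence[T]) -> T:
    return xs[0]

# A static checker concludes that first([1, 2]) is an int and
# first("ab") is a str; nothing about T survives to runtime.
print(first([1, 2]))   # prints 1
print(first("ab"))     # prints a
```

Note that in Python the quantification is only checked by an external tool, whereas in ML or Haskell it is part of the language itself.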

There is so much more to say about types, but I'm afraid Frank's basic point stands: you can't condemn the entire practice of types simply because you don't like the type system of a particular, impoverished language. Before you can make a claim about the power of type systems, you need to be well-versed in the field. (Note that I don't include myself in such a category.) I recommend Benjamin Pierce's text as a good starting point.

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
5/25/2004; 3:28:06 AM (reads: 1386, responses: 0)
Does this happen at run-time or compile-time? If at compile time do the types become "static"?

The variables are typed, they are just not explicitly typed... Everything happens before run time (at "compile time" if you want), and type assignments are static. Haskell is strongly typed (much as I dislike the term).

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
5/25/2004; 3:44:44 AM (reads: 1385, responses: 0)
I've already shown by an embedding that static typing is at least as expressive as dynamic; in my article I plan to show why it's strictly more expressive.

Frank, since we are essentially on the same side of the typing debate, let me try to suggest another way to approach the problem, which may help explain why these arguments are never ending.

You are discussing language expressiveness, while many in the dynamic typing world are not thinking about the languages per se but rather about programming style. Recall Dan Friedman's use of the term style, as in Object Oriented Style. We are talking about Typed Style Programming. Another term, once used, was Typeful Programming.

Naturally, the fact that dynamic typing can be easily embedded in a static typing framework by using universal types doesn't convince those thinking about Typed Style. They are thinking about explicitly specifying types, so this demonstration is a non sequitur since it bypasses the issue they are concerned about.

I can anticipate your response: they are wrong about Typed Style just as they are wrong about typed language expressiveness. Of course they are. But it might be easier to explain the different issues involved by separating the discussion of statically typed languages vs. others from the discussion of Typed Style vs. others.

The two perspectives reinforce one another, of course.

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
5/25/2004; 8:10:31 AM (reads: 1367, responses: 1)
I don't have time right now to respond to everyone, especially Peter's latest novel (not that I'm not extremely guilty of being long-winded myself! :), and in fact I may not respond for a while since I'll be away for about a week here, but I'll try to take a look at it tonight.

For the moment, let me just reply to this:

andrew: frank, couldn't you *try* to understand what he's saying? i read your deleting type info stuff and - assuming i understood it! - it seemed like that was what he was trying to say, and what i understood by what he wrote and what you, too, understood by what he wrote. so why can't you write "i'd frame your argument like this ... and my answer then would be ...".

I am trying to understand him, but I am also trying to elicit from him a precise and unambiguous response for these reasons:

  1. First, because I think a lot of this static-dynamic debate is founded on specious, implicit and vague assumptions which no one ever voices, and it's worth making them explicit. Peter didn't make it explicit, so I had to, and I did, right?

  2. Second, because these assumptions are implicit and vague, people frequently back out of their claims without appearing to do so. That's part of the reason these discussions keep going around in circles and never get anywhere. I wanted Peter himself to make an unambiguous claim and then show that it was false (or, actually, at worst only unfair or pointless).

  3. Third, there is a style of rhetoric (think Plato) in which a teacher asks questions of a student in order to get them to think more deeply about the issues of the debate, and that is what I was pursuing. Yes, it is a bit patronizing of me to put myself in the position of teacher and Peter the student, but then, I am a teacher and a qualified expert on these matters.

  4. Fourth, as a practical matter, because I plan to write up my argument in an article anyway, I didn't want to duplicate my efforts and write it all out here, so I thought it would be less work for me to get Peter to do the talking.

I hope that clarifies matters.

Me: My claim will be that information is necessarily lost, because you cannot recover the type of the variable from the type of all the values it may hold, that is, your `shifting' operation is not an isomorphism.

Sjoerd: Interesting. I'd like to see proof of that!

It will appear in my article.

Daniel Yokomizo - Re: The Case for First Class Messages  blueArrow
5/25/2004; 8:14:21 AM (reads: 1359, responses: 1)
I take it by "type inference" you are referring to the system figuring out the types and applying those types to the "untyped variables" thus making them "typed"! Does this happen at run-time or compile-time? If at compile time do the types become "static"?

In Haskell every expression has a static type. The compiler tries to figure out the type of a declaration by looking at its implementation. Let's see how the compiler infers the type of some Haskell code.

partition takes a predicate and splits a list in two based on it:

partition p xs = foldr select ([],[]) xs
  where select x (ts,fs) | p x       = (x:ts,fs)
                         | otherwise = (ts,x:fs)

The compiler already knows the types of (,), (:) and foldr; if it didn't, it would figure them out first.

(,) :: a -> b -> (a,b)
Takes two values and returns a tuple containing them.

(:) :: a -> [a] -> [a]
Takes an element and a list of elements of the same type, and returns another list with the element as its head and the list as its tail.

foldr :: (a -> b -> b) -> b -> [a] -> b
Takes a binary function, an initial value and a list, and gives a single value. Here is a visual explanation of how it works: the list [1,2,3,4] is actually 1 : (2 : (3 : (4 : []))), where [] is the empty list; basically a linked list. The foldr operation is equivalent to replacing the [] at the end with the given initial value and each (:) with the given binary function.

foldr (+) 0 [1,2,3,4]
-- is equivalent to
foldr (+) 0 (1 : (2 : (3 : (4 : []))))
-- is equivalent to
(1 + (2 + (3 + (4 + 0))))

Basically "foldr select ([],[]) xs" tells the compiler that xs is a list of something, ([],[]) is a pair of empty lists, and select is a binary function taking an element of the first list and a pair of lists:

xs :: [a]             -- a list of some type "a", which is a type variable
([],[]) :: ([b],[c])  -- a pair of lists of types "b" and "c"
select :: a -> ([b],[c]) -> ([b],[c])  -- according to the definition of "foldr"

Now it needs to verify the select implementation and see if it matches the inferred usage. If there's an error, the compiler will point out the mismatched expression.

select x (ts,fs) | p x = (x:ts,fs) | otherwise = (ts,x:fs)

The "|" symbols are guards; the definition is equivalent to writing:

select x (ts,fs) = if p x then (x:ts,fs) else (ts,x:fs)

The compiler sees the bindings of select, "x" and "(ts, fs)", and assigns them the types "x :: d", "ts :: e", "fs :: f". It sees the application "p x", so it knows that "p" must be a function; as the result is used in a boolean context (i.e. the guard), it knows the result must be a boolean, so it assigns "p" the type "p :: d -> Bool". Then it checks the branches of the if and, knowing the definitions of (,) and (:), determines that the types "e" and "f" are equal to the type "[d]" (i.e. it unifies the type variables). The result of "select" has type "([d],[d])", so the function has type "select :: d -> ([d], [d]) -> ([d], [d])".

Now the compiler combines the two pieces of information it has to find the type of this expression: "foldr select ([],[]) xs". It unifies the type "d" with "a", "b" and "c", giving the following types:

p :: a -> Bool
xs :: [a]
([],[]) :: ([a],[a])
select :: a -> ([a],[a]) -> ([a],[a])

The declaration is then typed as:

partition :: (a -> Bool) -> [a] -> ([a], [a])

partition is a polymorphic function, as it works with lists of any type (i.e. we have a type variable "a", not a concrete type). This kind of polymorphism is known as parametric polymorphism.

The type inference process is entirely static and usually* you don't need to specify types of expressions or declarations for purposes other than documentation.
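The unification step in this walkthrough can be sketched as a toy algorithm. The Python below is illustrative only (it omits the occurs check, treats single lowercase letters as type variables, and represents compound types as tuples); it is not how a real Haskell compiler is implemented:

```python
def is_var(t):
    # Type variables: single lowercase letters like 'a', 'd'.
    return isinstance(t, str) and len(t) == 1 and t.islower()

def resolve(t, subst):
    # Follow the substitution until we hit a non-variable or an
    # unbound variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(t1, t2, subst):
    """Extend `subst` so that t1 and t2 become equal, or fail."""
    t1, t2 = resolve(t1, subst), resolve(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        return {**subst, t1: t2}
    if is_var(t2):
        return {**subst, t2: t1}
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        # Compound types like ('list', a) unify componentwise.
        for a, b in zip(t1, t2):
            subst = unify(a, b, subst)
        return subst
    raise TypeError(f"cannot unify {t1} with {t2}")

# Unifying the element type 'd' from select with the list element
# type 'a' from xs, as in the walkthrough above:
s = unify(('list', 'd'), ('list', 'a'), {})
# s == {'d': 'a'}
```

The real algorithm (Hindley-Milner) does this systematically over every subterm, which is what lets the whole declaration be typed with no annotations.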

Haskell definitely has a lot to offer in terms of "losing the requirement" to manually supply the "type information" for variables. How about dynamic objects? I read that Haskell supports objects; how do they fit into things and into my comments?

Haskell doesn't have a primitive object encoding defined, so we have to make our own if we want to use objects. As I'm a newbie in Haskell I don't have enough knowledge to answer this (I think I know how to do it but I don't know how to explain it yet, i.e. I don't grok it). As other people often say, when I'm using Haskell I don't feel the need for objects as often as I do when using an OO language (e.g. Java, Smalltalk), because the language's features reduce the desire to solve problems using subtyping. AFAIK using objects in Haskell isn't as elegant as using objects in Smalltalk, and I don't think it would fit nicely in the language (because the language was designed without it).

* In some cases you need to write some type declarations to help the compiler, but they only happen when you use more advanced features of the type system (e.g. type classes).

andrew cooke - Re: The Case for First Class Messages  blueArrow
5/25/2004; 8:34:37 AM (reads: 1354, responses: 0)
(i assumed the reference to objects was connected to type classes).

Chris Rathman - Re: The Case for First Class Messages  blueArrow
5/25/2004; 8:35:51 AM (reads: 1347, responses: 0)
A while ago, I posted a message on LtU that delved into OO Shapes in Mercury and Haskell. The Haskell example is not too bad if you use existential types to mimic the subclassing.

Not that this is how you'd really want to use Haskell....

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
5/25/2004; 10:37:38 AM (reads: 1350, responses: 0)
To the list of reasons above, I should also add that my argument for static typing depends critically on exactly how one compares untyped and typed programs. That is, when you say, "untyped language X is better/worse/cooler/dumber/neater than typed language Y", you have in mind pairs of programs (P,Q) where P is in X and Q is in Y. Usually people assume there is a function which takes Q to P, the `untyped version of Q', but it is left implicit that `untyped version of Q' means `P is the type-erasure of Q'.

In a nutshell, my argument will be that this is an intrinsically biased comparison, because it erases the type information which, after all, is the whole point of static typing. What you are left with after this erasure is only the dynamic semantics, that is, the `behavior' of the program. Since the typeable lambda-terms are a subset of the untyped ones, it appears that you lose some ways of expressing dynamic behavior. So you can say that this sort of comparison presumes (a priori) that the only important thing about a program is how it behaves at run-time.

My counterargument will be that a fairer comparison is one that rather embeds the untyped language in the typed language (using a universal type), since that preserves all the dynamic and static information about both languages. When you do this, you exhibit the untyped language as a subuniverse of the typed language, and exhibit static typing as letting you create many little customized subuniverses, one for each type. In contrast with type-erasure translation, you can say that this comparison acknowledges that static information may be important.

What this has to do with integers and reals is the following. If I want to compare reals and integers, I can do it by mapping the reals into the integers, or the integers into the reals. It is easy to see that, because of cardinality, the former will necessarily lose information. Based on such a comparison, no reasonable person would then conclude that the integers are as generic/reusable/whatever as the reals. A fairer way to do it is to map the integers into the reals, and then you see that there are some equations which have real solutions but not integer solutions. This is fair because the continuum is big enough to accommodate the integers, and it does not presume a priori that the information lost when trying to embed the reals in the integers is useless.

BTW, just to be clear, this is not the whole argument. Given two sets X and Y, showing that X is embedded in (mapped one-to-one to a proper subset of) Y does not prove that Y is bigger than X. That's only true if X and Y are finite. If X and Y are infinite, then you can embed X in Y and Y in X, and X and Y can still have the same cardinality and be isomorphic. For example, the naturals are isomorphic to the integers. (How? Hint: consider two's complement.) So saying that the typeable lambda-terms are a proper subset of all the untyped lambda-terms doesn't prove anything, any more than saying that the untyped lambda-terms are embedded in the typed lambda-terms (via the universal type embedding; I hate this terminology, let's call it the Scott-reflexive model).
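The universal-type embedding can be made concrete with a small sketch. The Python rendering below is illustrative only (the names Univ, inject_int and project_int are invented here): the whole untyped world becomes values of a single type, with a tag marking which "subuniverse" each payload lives in, and projections are where the untyped language's runtime type errors reappear inside the typed one.

```python
from dataclasses import dataclass

@dataclass
class Univ:
    """The universal type: every untyped value is a tagged payload."""
    tag: str        # 'int', 'str', 'fun', ...
    payload: object

def inject_int(n):
    # Injection: lift a typed value into the universal type.
    return Univ('int', n)

def project_int(u):
    # Projection can fail: this is the untyped language's runtime
    # type error, now expressed inside the typed language.
    if u.tag != 'int':
        raise TypeError(f"expected int, got {u.tag}")
    return u.payload

# An untyped addition becomes a typed function Univ -> Univ -> Univ:
def add(u, v):
    return inject_int(project_int(u) + project_int(v))

add(inject_int(2), inject_int(3))        # Univ('int', 5)
# add(inject_int(2), Univ('str', "x"))   # raises TypeError at run-time
```

Nothing is erased in this direction: the untyped programs all survive as programs over Univ, while the typed language keeps its many smaller subuniverses alongside.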

Avi Bryant - Re: The Case for First Class Messages  blueArrow
5/25/2004; 2:03:10 PM (reads: 1310, responses: 0)
So you can say that this sort of comparison presumes (a priori) that the only important thing about a program is how it behaves at run-time.

That's a fair critique. The flip side is that comparisons made by advocates of static typing usually ignore the possibility that a program might *change* at run-time. It wouldn't be much of a stretch to say that Smalltalk is a program that has been continuously running for over 20 years. The only assumptions that you could have made "before run-time" would have amounted to a model of a Turing machine (or, at any rate, of Smalltalk's virtual machine) - anything stronger could and probably would have been invalidated during the run.

Unless you consider that possibility, your comparisons are going to be no fairer than type-erasure, and of no interest or persuasive power for users of languages such as Smalltalk and Lisp.

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
5/25/2004; 2:40:55 PM (reads: 1301, responses: 0)
Not that this is how you'd really want to use Haskell....

What? You don't want to perform some action on a list with different kinds of things? I do that all the time!

Even this specific example, drawing a list of different kinds of shapes, is not that far-fetched. How should one do that in Haskell, then?

Andrew Waddington - Re: The Case for First Class Messages  blueArrow
5/25/2004; 3:02:28 PM (reads: 1312, responses: 0)
Having read the *entire* blog I suppose I can comment.

First off I think that there is a pretty large apples & oranges problem here. I'll get to it later on...

Frank seems to have behavioural types confused with signatures or roles. Signatures being concrete method signatures, roles being incomplete method definitions. Can I write or compile,

string getName()true string name = getName(); or setName( getName() )

let's divide name by twelve; if the computer hangs, we can't; if the user knows who I am, it's all good.

Earlier posts on encapsulation: as a writing construct providing discrete abstraction, ok; as a data collation construct providing normalisation, ok.

As a programming philosophy in and of itself? garbage!

Back to the apples and oranges. There is no doubt that Mr. Lount has exceptional experience, and as evident from his posts, is a verifiable authority on the use of programming languages in the real-world. If he says untyped is faster and easier, so it is.

Frank, however, seems to be the only poster that understands that a functional computer program is inherently typed and cannot exist otherwise; I'm sure he sees untyped as being equal to randomly typed (can a compiler *really* trust the programmer?). Let's go back and see if we can divide name by twelve, shall we?

From the perspective of the computer programmer, a computer language is an MMI, a man-machine interface. Forget not, that your computer program is converted into electrical impulses, etc. etc. etc...

Even in a massive 14,000 class program, the program is eventually reduced into a typed and discrete set of __(important word coming up)__ instructions. Generally computers don't read strings by accessing memory one byte at a time until the processor halts.

I'm sure that from Frank's (correct) perspective writing **computer instructions** that have undefined or unspecified results is garbage. However, as Mr. Lount has demonstrated, writing **programming statements** that contain 60% unnecessary information is a waste of time.

Both right.

Clearly, we cannot escape typing; no computer, processor, nor compiler could function without it. Type inference ("untyping") is just typing via the compiler, preprocessor, or whatever. Keeping with "first class functions", you're just saying that instead of some symbolic reference being a data reference, it's an instruction reference; no biggie.

As a programmer, writing program statements, I will probably want some specified and discrete set of results from a given "1st class reference". That means a type (specified and discrete set of resultant instructions and data access). So we introduce patterns.

It seems useful here to implement a notion of flexible symbols that demonstrate constrained (discrete) activity (I don't like the word behaviour here, its wrong). Or, more specifically, flexible symbols that demonstrate or indicate activity that occurs within some bounded sets of the expected. This can be explained to Frank as a symbol that can be represented by a finite conjunction of multiple sets of instructions; and it can be explained to Mr. Lount as a symbol that is useful as a programming statement resulting in predictable activities within a programme.

I'll take a short segue to disabuse you of a current notion of 'behaviour'. It is said that programming statements have a behaviour; this is false. Programming statements have activity. Defining a method or function does not provide 'behaviour'; it conducts some *action* (normally on some data). You could say that if there were some switch clause that elected among several different activities based on some predicate, well, then perhaps you have behaviour, but it is a misnomer and misleading.

Most English speakers have a grossly data-centric programming perspective, partly due to natural language, and partly due to chip design, history, and the dreaded memory-access violation. We can type data references in any old way we want to; we can relate method signatures (and call it, erroneously, a type) to the type of a result or parameters, or both; what we can't do is differentiate between a "1st class function" like this,

functionA( void ) { System.crash( ignominiously, preferably_withSparks, smokeToo! ); }

and

functionA( void ) { x = x; }

So, how to introduce a programming symbol that has defined, or at least bounded (predictable) activity?

resultWhoCares someMethodNameDontCareEither( ParameterType careLess ) {
    init();
    addStuff();
    printResult();
}

To say that the method name is the behaviour is off, as is ascribing it to the result type or the parameter type. The behaviour is that the method signature ( defined symbol ), does some init thing, then some addStuff thing, and then some printResult thing. And in my limited experience, I haven't seen any programming construct that will allow me to predict that programming symbol A will access a file and that programming symbol B will not...

In the data-centric model we use, we can restrain the programmer into identifying the inter-relation between programme components, in Java, external references are introduced using an import statement, though Ada is much nicer. We can analyse the introduced components and their dependencies, and eventually, determine if at any point we had data-access to a file, a disk io component or such. We use these names to indicate _to the programmer_ that such and such is going to cause the disk to spin up.. . How so for the compiler, debugger, or the chipset? The chipset knows when it is doing disk io, because there is extra electricity on the bus at such and such an address.

For the longest time programmers have relied on conventional language constructs to denote specific behaviour, start(), new, delete, close(), etc.

Now how do we, for Frank's sake, define programming symbols that denote discrete and predictable series of actions?

A poor example to demonstrate. We'd like to define a behaviour for a set of programming statements, the behaviour is for the application to make some data visible to the user. Here is a sample symbol definition:

typedef userVisibleData = { getData showUser }

A getData function can get a number from a File, user input, a web site, or thermometer, doesn't matter.

A showUser function can put the number to Screen, disk, printer, wiggle the mouse, whatever, doesn't matter.

A userVisibleData symbol that denotes a set of programming statements taking a number from my web site and printing it to a user's screen should be assignment-compatible with a userVisibleData symbol that takes an image from Peter's fax and prints it to Paul's digicam screen.

So, Mr. Lount, how do we go about doing type inference here? Can we? More importantly, how do we do it so that programmers don't start exploding?

Andrew

Mark Evans - Re: The Case for First Class Messages  blueArrow
5/27/2004; 2:30:07 PM (reads: 1181, responses: 0)

Frank, your article might link to projects like Starkiller as concrete examples; it was discussed previously on LtU. Such projects offer pedagogical advantages in this context. Python is a typical poster boy for the other side, and showing that benefits accrue to Python is more compelling to that audience than studies of more obscure languages like Haskell (beautiful though Haskell is).

Although these discussions often circle, they can be worthwhile if, in the end, they bridge the academic mainstream to the programming mainstream. That is why your engagement is valuable.

Marcin Stefaniak - Re: The Case for First Class Messages  blueArrow
5/28/2004; 3:55:31 PM (reads: 1127, responses: 8)
Concerning static vs dynamic typing, I have yet another point of view.

I prefer static typing. Moreover, I prefer Java to OCaml (and its kin), because I primarily use the Eclipse IDE, and it has utterly useful features (namely: code assist and auto-completion, refactoring, on-line compiling) which would be really hard to achieve with a dynamically typed language like Smalltalk, Lisp or JavaScript.

And ML-style type inference is only a small gain given that Eclipse has code auto-completion, so I don't have to write all that code manually letter-by-letter. You can write Java code in Eclipse pretty fast. And yes, the Java source will be a little longer, but the type tags there are very helpful for understanding the code (compared to ML).

Moreover, types can be used to specify interfaces between parts of the application written by different programmers. And they can be used to prevent average programmers from writing code that departs from the application's designed architecture, that is, from writing bad code.

Avi Bryant - Re: The Case for First Class Messages  blueArrow
5/28/2004; 4:48:35 PM (reads: 1122, responses: 0)
... refactoring, on-line compiling), which would be really hard to achieve with a dynamic-typed language like Smalltalk

Oh dear. Where do you think those features *came* from? Eclipse is, like VAJ before it, simply an attempt by a bunch of Smalltalkers to write a Java IDE that mimics a few of the features that Smalltalk environments have had forever. This isn't "we were first" chest-beating, it's just history: OTI was a Smalltalk company before they got bought by IBM and started doing Java tools (not necessarily in that order, I'd have to check), and the influence shows.

Josh Dybnis - Re: The Case for First Class Messages  blueArrow
5/28/2004; 6:08:32 PM (reads: 1126, responses: 3)
I primarily use Eclipse IDE - and it has utterly useful features (namely: code assist and auto-completion, refactoring, on-line compiling), which would be really hard to achieve with a dynamic-typed language like Smalltalk, Lisp or JavaScript.

Wow! I've heard this before, but it still blows me away how widespread it is that people think these things were invented by the Java community (or worse yet, by Microsoft).

- The first (and maybe still the best) refactoring tool was for Smalltalk.

- Smalltalk development is all done "on-line"

- Auto-completion has been around just about forever. I don't know when it was first developed; GNU Emacs has had it since the '80s.

These misconceptions give credibility to the argument that lack of exposure was the main reason Smalltalk, Lisp, etc. never gained widespread popularity. If programmers aren't aware of these tools now, with the information accessible on the net, it is not surprising that they weren't generally recognized 15 years ago either.

Daniel Yokomizo - Re: The Case for First Class Messages  blueArrow
5/28/2004; 6:27:36 PM (reads: 1113, responses: 3)
<rant>

You can write Java code in Eclipse pretty fast. And you know, the Java source will be a little longer,

The code won't be a little longer; it's usually three times longer, without even counting the type information. As Java doesn't have a simple syntax for declaring anonymous functions, we have to resort to hacks like anonymous classes when the need arises, or repeat the same blocks of code over and over, like for loops.
Also we have to carefully duplicate code that could be parametrically polymorphic, taking special care when dealing with primitives. Eclipse is a wonderful tool and it makes writing Java code much less painful, but at the end of the day it's still faster to write reliable code in SML or Haskell or O'Caml: it'll take fewer characters and the type system will catch most of the bugs.

but there type tags there are very helpful to understand the code (comparing to ML).

In languages with type inference (like the ML variants) you can write type annotations if you want to. People usually do that to provide documentation, but even if an annotation wasn't written you can ask the top level for the type of any expression, so this information is just a couple of keystrokes away.

Disclaimer: I was until the end of 2003 a full time Java developer and I still use Eclipse regularly.

</rant>

Isaac Gouy - Re: The Case for First Class Messages  blueArrow
5/29/2004; 12:22:54 AM (reads: 1095, responses: 1)
The first... refactoring tool was for Smalltalk
IIRC William Opdyke's research prototype was for C++. John Brant and Don Roberts' later work was for Smalltalk.

Patrick Logan - Re: The Case for First Class Messages  blueArrow
5/29/2004; 10:05:52 AM (reads: 1080, responses: 0)
OTI was a Smalltalk company before they got bought by IBM and started doing Java tools (not necessarily in that order, I'd have to check), and the influence shows.

Yes, OTI was an international Smalltalk company. IBM bought them for this reason, before Java became the "It Girl".

Patrick Logan - Re: The Case for First Class Messages  blueArrow
5/29/2004; 6:32:06 PM (reads: 1066, responses: 0)
William Opdyke's research prototype was for C++ John Brant and Don Roberts later work was for Smalltalk.

Opdyke wrote a thesis using C++ as the example language. There was no implementation as a part of the thesis, at least nothing significant.

Brant and Roberts (also from UIUC under Ralph Johnson) implemented the first tool to correspond to Opdyke's thesis. They chose Smalltalk as the implementation and target language because it was significantly easier than C++ for both purposes.

Patrick Logan - Re: The Case for First Class Messages  blueArrow
5/29/2004; 6:37:16 PM (reads: 1070, responses: 0)
Auto-completion has been around just about forever. I don't know when it was developed. GNU EMACS has had it since the 80's.

Earlier for the Lisp Machine and probably Multics Emacs and others.

Josh Dybnis - Re: The Case for First Class Messages  blueArrow
5/30/2004; 12:15:08 PM (reads: 1063, responses: 0)
IIRC William Opdyke's research prototype was for C++

Was the tool itself ever released? Is it available anywhere?

Marcin Stefaniak - Re: The Case for First Class Messages  blueArrow
5/31/2004; 3:38:48 AM (reads: 1025, responses: 0)
Avi, Josh - I do know the genesis of "refactoring", OTI Labs, Eclipse and so on, but it doesn't matter. My point is: in a statically typed language (for example, Java), refactoring can be much more useful than in a dynamic one (Smalltalk).

Consider renaming a method. In Java it is done without breaking the application (so long as the code doesn't depend on Java reflection, to be honest), while in the Smalltalk refactoring tool there is no such manipulation. The same goes for code assist and compiling: they are much more useful in a statically typed language. In Smalltalk and Emacs these features are a mere shadow of what they are in Eclipse JDT, and the primary reason for that is the dynamic typing of Smalltalk and Lisp.

Daniel - I admit that the ML family is superior to Java due to parametric polymorphism. Unfortunately, I'm already addicted to refactoring, code assist and on-line typechecking. I dream of an IDE for an ML-style language with these features, but I don't know of any. Surely it is a big effort to develop one.

Peter William Lount - Sub-Universes - NOT!  blueArrow
5/31/2004; 6:03:50 PM (reads: 1014, responses: 0)
Frank wrote: ... a fairer comparison is one that rather embeds the untyped language in the typed language (using a universal type), since that preserves all the dynamic and static information about both languages. When you do this, you exhibit the untyped language as a subuniverse of the typed language, and exhibit static typing as letting you create many little customized subuniverses, one for each type.

There is a serious problem with the "sub-universe" view in that "typed" languages are not super-universes of "untyped" languages. Why? As soon as you "type" a variable you are "constraining" it, and thus it is no longer - by definition - dynamic; it loses an essential defining characteristic. Thus a typed system is not a superset or super-universe of untyped systems.

This is especially the case for "statically typed" languages, since they produce "fixed" and "type hardened" code that would break if presented with a type outside the type definition. Granted, for "similar" types a conversion is often performed (i.e. integer to real or vice versa).

A note about the "universal type". In languages like Smalltalk all variables (and method parameters) are of one type: Object. You could say that this is a universal type; however, since variables are of no other type and are always of only one type in Smalltalk, this is much different from having a mixture of a universal type with other types in a system. It's important to separate "variable types" from "object types" in this discussion, since classes take over many of the roles of types in languages like Smalltalk. Since there are many "classes" in Smalltalk it has many "types of objects", but "variables" can hold any "type" at any time. Much of the flexibility of Smalltalk comes from the freedom gained by being free of "variable type constraints". I like to think of this as "dynamic freedom", as it makes a huge difference in programming style and thus results.

The distinction between "types" and "variable type constraints" is crucial for understanding the source of the expressive power of untyped languages. While "integers" fit within the larger real or "rational" number system, this kind of nesting doesn't work for "typed variables". It does work for "abstract data types" and "classes", in that you can have type or class hierarchies. However, neither typed nor untyped variables can be a super- or sub-universe of the other (as mentioned above). It may seem that untyped variables are the general case with no limitations, while typed variables are more specific due to the constraints added to them. However, I don't think that this one aspect alone would justify classifying typed variables as a sub-universe of untyped variables. There are other features of typed and untyped variables that come into play, such as statically compiled code vs. dynamic message lookup (early binding vs. late binding of messages).

Static compilation of the "type constraint" on a variable or parameter creates "rigidity" and "brittleness" in the program. In some cases it doesn't matter, but in the vast majority of cases the program is limited beyond what is necessary. Not only do programmers become "type happy" and over-constrain the types of variables, they spend too much time thinking about this! The key word here is "spend", as it takes real amounts of time to specify the types - time (and money) that is best used for other purposes.

The type constraints that seem to be most often applied in object oriented programs are on complex object "classes/types" that have a mix of variables pointing to other complex objects or to basic value objects (strings, numbers, nil, booleans, etc.). Often these complex objects form hierarchies and networks. The kinds of static type compilation optimizations available are limited with complex objects. The loss of dynamic message sending by compiling static message sends is a major cause of "brittleness", in that message lookup is no longer dynamic but becomes frozen to the compiled type.

It's no wonder that people who don't use interactive environments, such as Smalltalk's advanced IDE, believe that "types" are the way to go. Statically typed variables are one way to provide "type constraints" to ensure program safety. They make sense in a language where there is no run-time meta data access. In languages such as Smalltalk, Self, Zoku, etc., where there is a rich amount of meta data, the loss of dynamic flexibility costs too much. Thankfully, dynamically untyped languages are able to exist with run-time meta data, providing powerful solutions that enable highly dynamic and safe systems.

Run-time "type" safety in untyped languages, like Smalltalk, is more powerful than static type ensurance operations commonly found in current typed languages. This is especially the case for any language without run-time meta data. This extra flexibility comes from the linguistic expressiveness of the language and access to the meta data. All kinds of "type tests" from the simple to the complex including but not limited to: simple type tests for "class" (type), class or any of it's sub-class tree, any of a set of classes (and their sub-class trees), any of a set of specific object instances, any "value" tests, any "object identity" tests, any "complex object structure" tests, type error handling, plus more. Static langauges usually only provide the simplest "class" test for a class and it's sub-class tree (it's notable that this feature kills a major source of dynamism in a language). Staticly typed and complied programs need to be recompiled in order to have their types change. Programs written in dynamic languages, such as Smalltalk, can have the "type tests" changed at on the fly at run-time. In fact, type tests themselves can be made into first class objects thus providing a full object oriented approach to dynamic type tests.

One benefit of typed variables is that they offer a quick way to have a "class/type" test applied to a variable. However, I've found that much of the time that a test is needed on an object's type in my Smalltalk programs, the simple class test isn't enough; a more complex "validation test" is needed. So for all the benefits touted for statically typed variables, they have a severely limited set of type tests available to the user. The power of dynamic type testing and run-time object validation testing far surpasses what is available from simplistic type tests.

Error handling when objects don't pass "type" validation tests is another powerful capability that is essential in keeping programs safe while maximizing their flexibility and dynamic features. Most typed languages don't provide these powerful error detection and handling choices as typed languages catch the type mismatches at compile time.

Changes in the definition of a type require the re-writing and re-compilation of all of the affected code in the system, which often involves recompiling the whole system. In an untyped language like Smalltalk, changes in class definitions require re-writing of affected code, but since the Integrated Development Environment (IDE) provides "incremental" on-the-fly compilation, only those specific classes and methods affected need be recompiled. Usually this is a much smaller set.

There is little point in expanding the set of type tests in a statically typed language, since you'd need a much more expressive syntax to convey the meaning. The best way to accomplish this is with the programming language itself, which implies having rich run-time meta data available to implement the tests. Of course you could do it at compile time, but you'd be limited to only those possibilities that work in the compile-time situation, plus you'd still be stuck with statically compiled, and thus rigidly frozen-in-time, code.

The "sub-universe" view is too simplistic to accurately represent what is going on with typed and untyped language issues. In fact it's a view with serious flaws. A model needs to take into account all aspects; if aspects are lost in the details the model won't accurately represent what it's attempting to model or explain. Typed variables are at a major fork in the road as far as language issues are concerned with neither typed or untyped being a sub or super of the other. Take your pick, the low road of typed variable languages or the high road of untyped dynamic languages.

I'm continuing my efforts to deepen our understanding of these different directions. I wonder how much of current programs, in typed or untyped languages, is concerned with "type safety", "object existence", "object type tests", "object value tests" (i.e. testing the variables of an object), etc.?

Peter William Lount - Re: The Case for First Class Messages  blueArrow
5/31/2004; 10:11:38 PM (reads: 986, responses: 0)
Frank wrote: {dynamic untyped languages erase} "the type information which, after all, is the whole point of static typing. What you are left with after this erasure is only the dynamic semantics, that is, the `behavior' of the program."

The "type" information isn't "erased", it's moved and only a small part of it is "erased". The majority of the information about the type is moved from the variable to the object contained with the variable. Any dynamic run-time object type and value validations that are needed for the variable can be moved to methods accessing the variable and thus become explicit. The part that is erased is the part that is used to "compile" early binding of messages and specific type tests on the code at compile time.

"Since the typeable lambda-terms are a subset of the untyped ones, it appears that you lose some ways of expressing dynamic behavior."

Let's clarify by extending your statement: "... you lose some ways of expressing dynamic behavior" when you type variables. That's what I've been saying.

"So you can say that this sort of comparison presumes (a priori) that the only important thing about a program is how it behaves at run-time."

No, I'm not saying that "the only important thing about a program is how it behaves at run-time". There are certainly things that can be learnt about a program from its source code, which is what "static variable type" compilers do. Certainly to those languages it's important.

What I am saying is that it's not necessary to have static types for a program to run. What I'm saying is that the behavior of a program at run-time is a very important aspect. I'm also saying that the most important type information is preserved - not erased - by virtue of the classes chosen and run-time dynamic type tests applied in an untyped version of a program.

I'm asking: what compelling benefits, if any, does the type information that is "erased" provide? It doesn't provide us with the "class" information, since that's already been transferred to the class. It doesn't provide us with anything but the simplest of "type" checking validations. It does provide a set of compile-time code analyses that can lead to optimizations. What else does this so-called "erased" info provide us with?

It's like this: yes, "relational database" theory has its merits and seems to be consistent in its own way. Certainly there isn't a theory for "object databases" at this time that is as well developed. Does this mean that object databases should be ignored and only relational databases used? No, there are compelling benefits to object databases. Do these compelling benefits imply that relational databases should never be used? No, it all depends upon what the desired outcomes are. What is needed is an understanding of the costs (technical and monetary) of using a relational database as compared with using an object database. The same goes for typed and untyped languages.

At one end of the spectrum, statically typed variables provide a well defined "groove" that constrains the "type" of information that can be placed in all of the variables, input parameters and return values in the source code of functions. Only "things" of the correct "type" can be placed into these "containers" known as variables. By placing these "type definition" constraints upon the "containers", a program like a compiler can follow straightforward algorithms to determine if the program is "type consistent" within itself.

Unfortunately this well defined groove created by "typed variables" becomes a self imposed "rut" that can be difficult to get out of as a program grows and evolves.

At the other end of the spectrum, untyped variables provide an open environment where variable containers can hold onto any kind of object in the system. In general, variables are constrained by the "pathways" that objects take through the code. When an object is created, assigned to a variable and passed around to other objects via methods, a path is formed. This is similar to the "static grooves", and in many cases the object's path might look the same. However there is a big difference: the object is free to be redirected to another path at any time the code chooses. This is possible due to variables being untyped and to the existence of polymorphic message sends that are resolved at run-time. Any need to keep objects along well defined paths or grooves can be handled during run-time by the appropriate validation checks.

Static type definitions on variables constrain variables; that is their purpose. Untyped variables can certainly have the simple "type/class" tests applied at run-time. It's also certainly possible to over-constrain untyped variables with too many run-time tests. I suspect that this is a common tendency for novice untyped-language programmers who have lots of static type programming experience. It's also an issue for code that is highly specific yet can have lots of different and unpredictable kinds of objects input as parameters. Take graphical user interface code where the user can drag and drop any kind of object: it's prudent to apply common sense data input validation in these cases to prevent invalid situations that don't make sense. The challenge is finding the right balance between "validation" checks and "dynamic" and "open ended" choices. This is where learning how to write generalized code comes into play, and where the power of polymorphism shines.
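
The drag-and-drop point can be sketched in Python: the drop target accepts any object at all and applies a run-time protocol check instead of a static variable type. All names here (on_drop, Photo, RawBytes) are hypothetical, made up purely for illustration.

```python
# Sketch: run-time validation instead of a static parameter type.
# Anything that can render itself is welcome; everything else is
# rejected gracefully at run-time rather than at compile time.

def on_drop(target, dropped):
    if not hasattr(dropped, "render"):      # duck-typed protocol check
        return "rejected: object cannot render itself"
    return target + ":" + dropped.render()  # polymorphic message send

class Photo:
    def render(self):
        return "photo"

class RawBytes:       # no render method: fails the validation check
    pass
```

The check constrains only what the method actually needs (a render capability), leaving every other kind of object free to participate, which is the balance between validation and openness described above.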

Static typing of variables over-constrains polymorphism and thus imposes limits upon the language's capability to provide generality, flexibility, dynamism and "robustness" in a program. Applying too many or overly tight validations upon a variable can also have this effect. The key is learning what balance works best in the programs that you are creating.

As for "erasure" let's look at an example to get clear. In a staticly typed language I could write:

int aCounter; aCounter = 1; ... aCounter++;

The variable "aCounter" now has an "integer" type definition associated with it. This says to the compiler to only allow "integers" into this variable and, if possible, convert anything being assigned to be converted to an integer otherwise generate a "compile time" type error and prevent the program from ever running. There are then some other statements "...". Finally "aCounter" is "incremented" with a special syntax operator that the compiler knows how to apply to a few types (i.e. pointers and integers).

In the C language, the type of the variable is the only way that the compiler knows what the object in the variable is. That is one of the main purposes of types in C: to inform the compiler of what operations will be allowed on the data and how and where it can be moved and copied. In untyped object oriented languages such as Smalltalk, the objects carry their "type" information with them. This is a major difference, and it creates a major fork in the road map of programming languages.

Here is similar code in Smalltalk that defines a temporary variable, initializes it, performs some code "...", and then increments the counter:

| aCounter | aCounter := 1. ... aCounter := aCounter + 1.

In most cases the code in the above two examples would be inside a loop. The end result is the same. In simple cases "aCounter" doesn't take on any other values and will always be an integer. In more complex cases - and the cases that I'm going on about - the "data" or "object" going into a variable isn't clear, as the "path" from the "data" or "object" creation to the point where it's copied into the variable can be quite long. In a statically typed language the path is a well defined "groove", so its type is fully predictable. Some people like that, but it doesn't mean that statically compiled programs will have no errors. Just look at Microsoft Windows XP! It's a very large C, C++, C# (and who knows what else) system.

In a dynamic untyped language the object going into a variable could be anything, which scares many of the "static bent". However, the fact that Smalltalk programs are formed from well structured objects and methods creates "well defined" yet "dynamic" pathways throughout the program. This alone goes a long way towards ensuring that only the "right" objects get to where they should be. Of course bugs are always fun to track down, and dynamic programs have their share of bugs, including bugs that allow unexpected objects to get assigned to a variable. Sometimes this happens as a result of carelessness. Other times it's an indication of the need for some validation checks. Most often it's a blatant mistake and the code just needs to be fixed. Sometimes the architecture of the objects needs to be adjusted or corrected or rethought. It's surprising how infrequently these kinds of bugs happen in well tested production code. When they do happen it's usually an indication of a deeper architectural issue, most likely caused by programmers rushed by project deadlines.

The prudent programmer using an untyped language like Smalltalk will take rudimentary precautions to protect his/her objects and methods from wayward objects. That's what validation checks are for, as well as writing excellent documentation. It's also what well designed polymorphic protocols solve.

Design isn't always an easy thing to apply to a program you are implementing, but program quality is important in many systems. Unfortunately quality isn't the main driving factor in most projects. Usually that's cost, time and scope with quality being dropped to keep the primary three factors balanced. Please see Value and Result Focused Software Design Process for more details and a diagram of project issues at play.

In conclusion, static types have their place in languages that don't have objects. In object oriented languages, static types on variables have proven not to be needed, even though it's possible to add them. The addition of static types to object oriented programming languages isn't necessary and clutters up the design of programs by adding extra "verbiage" (which clutters up the programmers' minds). Static types also increase the "rigidity" and "brittleness" factors of programs, which are already difficult enough to combat. In today's world it's often difficult to keep things simple without being simplistic.

scruzia - Re: The Case for First Class Messages  blueArrow
6/1/2004; 12:14:40 AM (reads: 994, responses: 2)
Using Eclipse, Marcin doesn't "... have to write all that code manually letter-by-letter. You can write Java code in Eclipse pretty fast. And you know, the Java source will be a little longer..."

Daniel Y. replied that "Eclipse is a wonderful tool and it makes writing Java code much less painful, but in the end of the day it's still faster to write reliable code in SML or Haskell or O'Caml."

But you're both ignoring the looking-at-code side of things: the fact that Java code is so much longer, and contains so much boilerplate, makes it take much longer to read and understand. Does Eclipse help you see at a glance that two half-page globs of Java code are identical except for a class name or a variable name?

In many cases, Haskell can help with that because those two chunks of code will often be two or three lines instead of a half a page each.
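
The point is easy to show in miniature. In the Python sketch below (names invented for illustration), two near-identical blocks differ only in one expression; a short higher-order abstraction makes that shared shape, and the one real difference, visible at a glance:

```python
# Before: two copies of the same loop, differing only in the transform.
def double_all(xs):
    out = []
    for x in xs:
        out.append(x * 2)
    return out

def square_all(xs):
    out = []
    for x in xs:
        out.append(x * x)
    return out

# After: one short abstraction; each call site shows only what differs.
def map_all(f, xs):
    return [f(x) for x in xs]

doubled = map_all(lambda x: x * 2, [1, 2, 3])
squared = map_all(lambda x: x * x, [1, 2, 3])
```

A reader comparing double_all and square_all must diff them line by line; a reader of the map_all calls sees the difference immediately, which is the readability argument being made here.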

Code gets written once, and read many times (during debugging, maintenance, revision, ...), and these discussions too often ignore the importance of the source code as a vehicle for communicating an algorithm to other programmers. Including the original author, months or years later.

Marc Hamann - Re: The Case for First Class Messages  blueArrow
6/1/2004; 10:53:24 AM (reads: 934, responses: 1)
the fact that the Java code is so much longer, and contains so much boilerplate, makes it take much longer to read and understand Java code

I have to disagree with your premises here.

Longer code is not necessarily less clear. Often the opposite; think of a Perl "one-liner" or a pithy mathematical definition that needs to be mentally "unpacked".

Likewise, Java can be written in a very clean style that makes it clear what is going on.

I think the desirable property is actually that each of the elements of the algorithm, whether iterative steps or recursive cases, has a single, clear representation in the code.

To my knowledge this is necessarily a product of style, rather than an enforceable or invariable property of any PL.

I do, however, agree strongly that readability of code is generally more important than speed of composition, or even, in many cases, performance of execution.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/1/2004; 6:21:12 PM (reads: 831, responses: 0)
I have to agree with all of Marc's excellent points including the last one about performance. Java code can be written in a clear style. Smalltalk code can be written in an ugly style or a clear style, as is likely the case with any programming language (except maybe LISP and APL! ;--) I'll take the clear version any day.

Please see the article "Syntax & Clear Literate Programming Style Improves Programs" for my full comments regarding Marc's post above.

scruzia - Re: The Case for First Class Messages  blueArrow
6/1/2004; 10:19:43 PM (reads: 820, responses: 0)
I have to disagree with your premises here.

You have inferred premises that I did not intend to imply. Of course the scale is not linear, and it's subjective, too. For my tastes, APL is way too short for clarity, as is most Perl. Indeed, Longer code is not necessarily less clear, and I didn't claim that it was.

I wholly agree about the desirability of ... a single, clear representation in the code ... for each element. That's one of the ways that Java falls down, due to its omission of even the most basic type inference. Compare Scala's

var x = new VeryLongClassName

versus Java's

VeryLongClassName x = new VeryLongClassName();

where you have to type (and read) the class name twice.

I disagree somewhat with Paul Graham's claim that Succinctness is Power, because my taste calls for a little bit of confidence-building redundancy, and because Power, to me, includes the power to communicate an algorithm to other humans. I just think Java requires a little too much of that kind of redundancy.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/2/2004; 12:10:57 AM (reads: 811, responses: 0)
I think that clarity is power. I think that literate programs assist in clarity. Many of the class, object, method, parameter and variable names that I use are quite long. They are that way for clarity by conveying the appropriate meaning. It's a good thing that I can type fast.

Where succinctness comes into play is having a succinct syntax that makes reading the program a breeze when literate naming expresses ideas and concepts well. Smalltalk enables succinctness of syntax with expressiveness in naming. One of the reasons that I like it so much.

The Zoku variant of Smalltalk that I'm working on keeps with this tradition and hopefully will work well for people. Cheers.

Frank Atanassow - Last Exit For The Lost  blueArrow
6/3/2004; 10:55:28 AM (reads: 754, responses: 0)
This is my last message to this topic.

Marcin: And the ML-style type inference is a little gain concerning that in Eclipse there is code auto-completion, so that I don't have to write all that code manually letter-by-letter.

Auto-completion is in no way a substitute for type inference. Completion only saves a few keystrokes in typing long identifiers; perhaps this is useful in Java where (excepting the new generics) the type system is nominal and every type is denotable by a single identifier. But in a language like Haskell, where the type system is much richer and all the interesting types are rather expressions and not simple identifiers, code completion will not help you. If it could, it would have to actually do type inference, or you would have to declare a type synonym for each type you want to use so the completer could find it.

Mark: Frank, your article might link to projects like Starkiller as concrete examples

As I recall, Starkiller is based on the idea of inferring annotations by treating a Python program as if it were the type erasure of a typed program. As an example of the type erasure paradigm, this would be OK, but it misses the point I made about type erasure as a means of comparing typed and untyped programs.

Andrew W: Frank seems to have behavioural types confused with signatures or roles.

I'm not confused. I regard a type as an algebra or coalgebra, not merely a signature for one, and not necessarily free or cofree, so it may include `behavior'. I don't know what a `role' is.

Sorry but I cannot reply to the rest of your post because I couldn't parse/understand most of it.

Avi: The flip side is that comparisons made by advocates of static typing usually ignore the possibility that a program might *change* at run-time... Unless you consider that possibility, your comparisons are going to be no fairer than type-erasure, and of no interest or persuasive power for users of languages such as Smalltalk and Lisp.

The universal embedding certainly allows for that possibility; just think about it. I will address this in my article, though, thanks.

Peter: As soon as you "type" a variable you are "constraining" it and thus it is no longer - by definition - dynamic and it loses an essential defining characteristic. Thus a typed system is not a super set or super universe of untyped systems.

Peter, you are truly hopeless and only aggravate me, so I'm going to bow out of this discussion now. Earlier you asked for some references to the literature. You'll find some material here. Good luck.

Marcin Stefaniak - Re: The Case for First Class Messages  blueArrow
6/4/2004; 1:42:33 PM (reads: 685, responses: 0)
Well, this topic is badly overgrown, so this is my last message too.

1. Frank, you are right that auto-completion is not the same as type inference. But it is not merely a keystroke-saving utility. The feature I have in mind is properly called "code assist", and it works like this: on pressing Ctrl+Space, when the caret is right after the dot following an object expression, a pop-up window appears with the possible methods and fields.

So it's not an ordinary plaintext auto-completion, but a very contextual one. It's more than completing long identifiers.

Concerning programming languages, there are two major aspects that influence the value of code assist. The first is the type system - static is better suited for this purpose than dynamic. The second is syntax. The OOP syntax list.add(x) may be slightly better here than the functional List.add(l, x). There are, however, languages whose syntax works against code assist, SQL for example.

Code assist in Haskell? Sure, why not -- let the IDE do some type inference on-the-fly, it's not impossible.

2. Avi: programs changing at run-time? Sure we do that.

Imagine a messenger system. It has two parts, client and server, which are frequently and independently updated. It is as good an example of a changing program as the Smalltalk system. Yet it is written in statically typed Java (reflection is used, but only for convenient dynamic proxies).

The communication protocol is deliberately not done with Java RMI, because RMI depends on Java Object Serialization, which in turn requires that both peers' JVMs contain the same classes. Instead, the protocol depends on a somewhat simpler data layer, implemented in statically typed Java of course. BTW, that data layer is very similar to Frank's "universal type" idea; it does immerse dynamically typed data into statically typed Java.
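The "universal type" embedding Marcin alludes to can be pictured with a small sketch. This is a hypothetical Python illustration, not the thread's actual Java system; the names Dyn and as_int are invented for the example. The idea is that dynamically typed data is immersed in a statically typed layer by wrapping every value in an explicitly tagged container whose concrete payload type is only examined at run-time.

```python
# Hypothetical sketch of a "universal type" embedding.
class Dyn:
    """A tagged value: the static data layer sees only 'Dyn', never the payload type."""
    def __init__(self, tag, value):
        self.tag = tag      # e.g. 'int', 'str', 'list'
        self.value = value  # the wrapped payload

def as_int(d):
    """Project a Dyn back to an int, failing at run-time on a tag mismatch."""
    if d.tag != 'int':
        raise TypeError(f"expected 'int', got {d.tag!r}")
    return d.value

# The data layer can move Dyn values around without knowing their payload
# types; only at the use site is the tag checked:
n = as_int(Dyn('int', 42))    # succeeds, n == 42
# as_int(Dyn('str', 'hi'))    # would raise TypeError at run-time
```

The tag check is exactly the run-time type test that static typing normally discharges at compile time, which is why such an embedding lets a static language host dynamic data.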

And the moral is: to develop high quality software systems, keep the dependencies tight.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/4/2004; 2:09:46 PM (reads: 686, responses: 3)
Frank wrote: This is my last message to this topic.

I'm sorry to hear that. Your on-topic comments are well thought out, logical and quite reasonable. Obviously you've put some thought into them. I appreciate that and I'm sure others do as well.

Frank wrote: "Earlier you asked for some references to the literature. You'll find some material here. Good luck."

Thank you. I've been to your web site before. Could you provide very specific links (an article or paper perhaps) that would support your point of view?

What about the questions that I asked about what information is lost in "type erasure"? What sources enumerate the various forms of information that are "lost" when a program moves from being "typed" to "untyped"?

Peter wrote: As soon as you "type" a variable you are "constraining" it and thus it is no longer - by definition - dynamic and it loses an essential defining characteristic. Thus a typed system is not a super set or super universe of untyped systems.

Frank wrote: "Peter, you are truly hopeless and only aggravate me, so I'm going to bow out of this discussion now."

Well, my friends have called me hopeless before, but to receive a promotion to "truly hopeless", wow! I just don't know what to say about yet another ad hominem personal attack. I apologize if you feel aggravated by my posts; that wasn't intended. While I understand your arguments I disagree with a number of your conclusions.

I'm intending to engage in a professional quality discussion regarding first class messages, and to explore "typed" and "untyped" systems to find out the compelling benefits that both offer. This is in the context of determining what features to support in Zoku, the variant of Smalltalk that I'm developing.

Whether or not a language supports typed variables seems to have a huge impact on the language and the meta data available in it (i.e. its support for messages as first class objects). In theory this shouldn't be the case, but in practice language implementations are humbled by the reality of current computers and the systems they must interoperate with.

Frank wrote: What this has to do with integers and reals is the following. If I want to compare reals and integers, I can do it by mapping the reals into the integers, or the integers into the reals. It is easy to see that, because of cardinality, the former will necessarily lose information.

Yes, this makes sense for reals and integers (except for cases like 2.0 going to 2, where no information is lost), but it's not as simple as that for typed and untyped variables, because "variables" (of various kinds, such as instance variables, temporary variables, parameters or return types) are quite different from numbers.

It's very important to distinguish that I'm not talking about "number or object types" in a language that are contained within the variables. I'm talking about the "variable slots" that contain the information. Your parallel analogy works fine for the "numbers or objects" themselves in a language, just not the variable/parameter slots that they are contained within or referenced by.

Based on such a comparison, no reasonable person would then conclude that the integers are as generic/reusable/whatever as the reals.

Ah, the crux of the matter.

In my view the parallel you are drawing with "typed-untyped" and "real-integers" while reasonable, doesn't apply to "variable slots". Why? There are relevant dissimilarities between numbers and variable slots that make the analogy you are drawing a false analogy. Back to this in a moment.

I also take exception to the form of part of your argument. The conjoined statement "... no reasonable person would then conclude ..." implies that one would have to be "an unreasonable person" to conclude that "untyped variables" are NOT a "sub-universe" of "typed variables". Your statement is structured such that if I disagree that your comparison applies and reach different conclusions, I must therefore be "an unreasonable person". (Is this the source of your calling me "truly hopeless"?)

This flaw in your argument comes from conjoining the comparison with "... no reasonable person would then conclude ...". I agree with your conclusion regarding integers and reals: information is lost converting from reals into integers. I understand the parallel analogy you draw, from converting reals into integers to converting typed variables into untyped variables, but as stated above, I don't agree that it applies. By conjoining these two separate points into one presupposition, the reader is stuck agreeing with you in order to be a reasonable person. There isn't much room to disagree, since who really wants to be unreasonable? The alternative for the reader is to recognize the form of the argument as a faulty conjunctive statement and point it out, as is done here.

Why doesn't your parallel analogy apply? What are some of the relevant dissimilarities between numbers and variable slots (variables, parameters, temporaries, etc..) that make the analogy you are drawing false?

Information is potentially lost in both directions with variable slots, but not with numbers, where information is only lost going from reals into integers (excluding limits on machine precision, that is). Because information is lost in both directions, the analogy can't apply, since it assumes that information is only lost in one direction. Significantly, the loss of information in both directions means that neither typed nor untyped variable slots can be a subset or superset of the other - if you want an accurate model that takes the loss of information in both directions into account, that is.

Type Erasure
Going from typed variable slots to untyped variable slots, the information available to the compiler at compile time changes. The type of each variable is no longer available at compile time. Actually, in some cases it can still be determined by tracing the program flow pathways at compile time.

At first glance this seems like a disaster; however, languages like Smalltalk thrive because the vast majority of the type information is not "lost" or "erased". It's moved, or "transformed", into the "classes" of the "objects" that will be placed into the variables.

In addition, techniques are available to determine the types of untyped variables if that's really needed. Type inferencing is one way to determine what was once explicitly declared by the programmer, by recording the set of object types that a variable actually holds during run-time. The Self language uses this very effectively to dynamically recompile customized and optimized versions of methods based upon the parameter types observed at runtime.
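The "recording" form of run-time type observation described above can be sketched in a few lines. This is a hypothetical Python illustration, not how Self or Smalltalk actually implement it; record_types and observed_types are invented names. A wrapper notes the class of each argument a function actually receives, accumulating the set of types each parameter slot holds over the program's run.

```python
# Hypothetical sketch: record the object types a parameter slot actually
# holds at run-time, in the spirit of "dynamic type inference" above.
import functools
from collections import defaultdict

# Maps (function name, parameter index) -> set of class names seen so far.
observed_types = defaultdict(set)

def record_types(fn):
    """Wrap fn so every call records the classes of its actual arguments."""
    @functools.wraps(fn)
    def wrapper(*args):
        for i, arg in enumerate(args):
            observed_types[(fn.__name__, i)].add(type(arg).__name__)
        return fn(*args)
    return wrapper

@record_types
def area(width, height):
    return width * height

area(3, 4)        # records int, int
area(2.5, 1.5)    # records float, float

# observed_types[('area', 0)] is now {'int', 'float'}: the set of types
# this slot has actually held, recovered with no declarations at all.
```

A system like Self goes further and feeds this kind of observed-type information back into the compiler to specialize methods, but the recording step itself is this simple.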

Variables form pathways for information to flow through a program's code. Some of the type information that isn't moved, and thus what might be "erased", is the "type restricted" aspect of those pathways: restrictions that allow objects to travel only along certain "typed" pathways (unless converted to another type).

In dynamic systems these "type" pathways can be observed and "recovered" at run-time by "type inferencing" systems that record the "object types" in actual use. This information can be helpful in debugging, testing, performance tuning (customizing code generation with optimized code after observing actual "object type" usage), vetting program design, learning how a program is working, maintenance, conversion to a typed language, and many other reasons.

A key advantage of untyped dynamic languages is that the programmer isn't spending time specifying the types of variables. The programmer's focus is on working with the objects and their interactions with each other, which is where the time is better spent.

Dynamic Erasure
Going from an untyped variable slot to a typed variable slot, the dynamic late binding capabilities are lost when they are "fixed" and "hard coded" by static binding compilers. For some simple cases this may not make any significant difference (except longer and more complicated source code), since the variable would only ever have one kind of object in it anyway (i.e. a local loop counter). However, the more variables that are typed in a program, the more rigid it becomes and the greater the loss of dynamic capabilities.

Go too far with typing and a program becomes so stiff that you are no longer simply adding type information; you must refactor it to bypass the rigid type constraints being applied. Thus "dynamic erasure" likely involves - potentially significant - refactoring to make a dynamic program static. Refactoring may also be needed when untyping a typed program's variables in order to benefit from the dynamic capabilities. Simply erasing the types may not give you the equivalent dynamic capabilities. Yet another reason typed and untyped variables have disjoint features and capabilities.

Dynamic capabilities are erased when constraints are placed on variables. This is the case regardless of whether this is done at compile-time or via run-time type checking validations. In statically compiled languages the restrictions are hard coded everywhere the "type" is used which produces program rigidity. Too much rigidity and programs become brittle, less general, and difficult to evolve and manage.

As a result of the information loss in both typed and untyped systems, neither can be a subset or superset of the other. A more accurate model would be two mainly disjoint sets of capabilities and features that overlap, to some degree, in a partial common set of capabilities.

Conclusion
I am still of the view that the costs and impact of typed variables are too significant for software development in light of the clear and overwhelming benefits gained from dynamic untyped variables. These benefits are demonstrated by systems such as Smalltalk.

I am, however, still attempting to find out the compelling benefits of compile time type technology. Run-time type inferencing seems to be of benefit, and provides untyped languages with much of what compile time static typing offers, without the costs of the programmer specifying types or the program becoming brittle.

I'm willing to change my view when presented with information that supports a view and that can sustain review. Naturally I submit my views to the same standard.

I hope that I've presented this point of view in a clear manner. I appreciate your questions, comments and dissenting, on-topic, non-personally-directed opinions.

Dominic Fox - Think about this...  blueArrow
6/4/2004; 4:11:14 PM (reads: 684, responses: 0)
Type inferencing is one way that can determine what was once explicitly defined by the programmer by recording the set of object types that a variable actually holds during run-time

But that's not what type inferencing means; at least, it isn't what everybody else means when they say "type inferencing" (you're welcome to redefine the term for your own use, of course, but you should perhaps telegraph your intention to do so).

I don't think you actually know what type inferencing means (that is, you don't know what everybody else is talking about when they talk about it). That's OK - neither do I (at least, not well enough to give an exact account of what it involves) - but I think you should know that you don't know.

There are some good, accessible texts linked from Frank's site - I recommend the Cardelli and Wegner tutorial On understanding types, data abstraction and polymorphism, which discusses the difference between the typed and untyped lambda calculi and goes on to talk about type checking and inferencing. It seems to me that the terms of that discussion are basically the terms of art in this field, and that the way Cardelli and Wegner use them is more or less (and I'm not expert enough to tell you where usage among experts diverges) the way they are commonly used by people who know what they are talking about.

As regards ad hominem, there is a difference between attacking your arguments based on unjustified inferences or insinuations about your character and commenting on the adequacy (or otherwise) of your demonstrated awareness of the subject under discussion. What you really know in the privacy of your own innermost thoughts is neither here nor there - you could be shamming ignorance, and it would still be quite legitimate to point out that you appeared ignorant. I don't want to go spiralling off into speech act theory at the tail end of an already overlong and meandering discussion, but I would commend to you Austin's observation that, irrespective of the drama being played out on the inner stage of the mind, "our word is our bond".

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/4/2004; 9:53:34 PM (reads: 661, responses: 0)
Dominic: But that's not what type inferencing means; at least, it isn't what everybody else means when they say "type inferencing" (you're welcome to redefine the term for your own use, of course, but you should perhaps telegraph your intention to do so).

No need to redefine type inferencing, since I'm using the standard definitions, which work just fine, as follows:

Chuck is a project that aims to improve the readability of Smalltalk programs by using type inference. Type inference can find types in programs even when the program's author was not constrained by a type checker to begin with. The inferred types can then be presented to the user and to other static analysis tool such as compilers and dead code removers.

While I don't agree that types improve the readability of Smalltalk (in my view types clutter a language when written inline with the code), nevertheless the above definition is not mine but fits within the standard definition of type inferencing. Types determined via run-time type inferencing aid in debugging and maintenance and serve the other purposes that I mentioned. I would say that inferred types can help one understand how a program is actually running. That's a nice benefit, and if they can be determined dynamically, which they can, that's great!

My statement that type inferencing is one way that (a program) can determine what was once explicitly defined by the programmer, by recording the set of object types that a variable actually holds during run-time, is a perfectly valid statement within the standard definitions of type inferencing. While "recording" might be a simplistic description of a type inferencing algorithm, it fits and achieves the results. Besides, it isn't defining type inferencing; it's indicating the net effect of how it can be used at runtime. Determining the types of intermediate expressions is also part of type inferencing, and was covered by "variables" (of various kinds, such as instance variables, temporary variables, parameters or return types) in my previous post. "Return types" clearly refers to the types of the return values from functions and sub expressions; all functions and sub expressions in Smalltalk involve message sends and thus return values, the types of which can be easily determined.

"Type inference automatically assigns a type signature onto a function if it is not given. In a sense, the type signature is reconstructed from the compiler/interpreter's understanding of the function's sub functions with well defined type signatures, and thus the input/output type can be ascertained."

Sounds good to me. Let's assign "type signatures" to the variables based upon the "types of the functions" or expressions.
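For contrast, the static sense of type inference quoted above (reconstructing a signature from the types of subexpressions, without running the program) can be sketched as a toy bottom-up inferencer. This is a hypothetical Python illustration over a miniature expression language, invented for this example; it is not a real algorithm such as Hindley-Milner inference.

```python
# Toy static type inference: the type of an expression is reconstructed
# bottom-up from the known types of its subexpressions. No execution occurs.
def infer(expr):
    """Infer the type of a tiny expression tree: int/str literals,
    '+' (integer addition), '++' (string concatenation)."""
    if isinstance(expr, int):
        return 'Int'
    if isinstance(expr, str):
        return 'String'
    op, left, right = expr
    lt, rt = infer(left), infer(right)
    if op == '+' and lt == rt == 'Int':
        return 'Int'
    if op == '++' and lt == rt == 'String':
        return 'String'
    raise TypeError(f'cannot apply {op} to {lt} and {rt}')

# The type of ('+' of two Ints) is known before the program ever runs:
assert infer(('+', 1, ('+', 2, 3))) == 'Int'
```

Note the difference from the run-time "recording" approach: here an ill-typed expression is rejected purely by analyzing its structure, which is the sense in which Dominic and others use the term "type inference".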

Let's see what the Self language people wrote about it in their paper "Type Inference of Self: Analysis of Objects with Dynamic and Multiple Inheritance": "Self features objects with dynamic inheritance. This construct has until now been considered incompatible with type inference because it allows the inheritance graph to change dynamically. Our algorithm handles this by deriving and solving type constraints that simultaneously define supersets of both the possible values of expressions and of the possible inheritance graphs."

The paper continues:

"The choice between static and dynamic typing involves a choice between safety and flexibility. The flexibility offered by dynamically typed object-oriented languages is useful in exploratory programming but may also be a hindrance to safety checking and optimization when delivering products. Henry Lieberman [8] and Alan Borning [2] developed the notion of object-oriented languages based on prototypes. The absence of classes and types in these languages yields a considerable flexibility which may be significantly increased by the notions of dynamic and multiple inheritance. These language constructs, however, make safety checking more difficult than for class-based languages."

"This paper presents a type inference algorithm for the Self language [13]. Self is a prototype-based dynamically typed object-oriented language featuring dynamic and multiple inheritance. Our algorithm can guarantee the safety and disambiguity of message sends, and provide useful information for tools such as browsers and optimizing compilers. Although we focus on Self, our work applies to other languages as well. Our approach to type inference is based on constraints, like in our previous papers on an idealized subset of Smalltalk. In [10] we defined the basic type inference framework, and in [9] we demonstrated an efficient implementation."

Close to what I have been saying!

Dominic: I recommend the Cardelli and Wegner tutorial On understanding types, data abstraction and polymorphism, which discusses the difference between the typed and untyped lambda calculi and goes on to talk about type checking and inferencing.

Thank you for the specific reference. I'll take a look at it as soon as I can.

Dominic wrote: As regards ad hominem, there is a difference between attacking your arguments based on unjustified inferences or insinuations about your character and commenting on the adequacy (or otherwise) of your demonstrated awareness of the subject under discussion.

Calling me "truly hopeless" is a clear case of "instead of attacking an assertion, the argument attacks the person who made the assertion". I didn't see any comment about my arguments alongside Frank's unsolicited and irrelevant personal opinion about me in that post. Frank didn't make any statements in that post regarding the adequacy, or lack thereof, of what he objected to, or about any "misunderstandings" of "terminology" or "concepts" on my part. Instead he left it as he did. This is a forum about programming languages, not about me! Let's keep the discussion professional and non-personal. That's what I'm asking.

Dominic: It seems to me that the terms of that discussion are basically the terms of art in this field, and that the way Cardelli and Wegner use them is more or less (and I'm not expert enough to tell you where usage among experts diverges) the way they are commonly used by people who know what they are talking about.

As far as I can tell I'm using the correct terminology. I've double checked and referred to the definitions in this post to demonstrate that I'm using valid terminology for the meaning of type inferencing. Please re-read what I've written carefully; sometimes it can take a couple of readings to understand someone else's technical writing. If you have questions or observations please let me know, as long as they are on topic and non-personal. If you have a variation of the concepts that I'm unaware of, please write something relevant that demonstrates the concept. If I'm missing a concept or two please let me know what they are. That's what I've been asking for since the beginning of this thread.

As for questioning my intentions I've stated my purpose and intentions for writing in this thread fairly clearly. As for knowledge I'm perfectly willing to acknowledge not knowing something. In fact I've expressly stated that I'm here to learn from any of you who have something relevant to contribute.

Dominic: it would still be quite legitimate to point out that you appeared ignorant.

Not if it's phrased in the form of an ad hominem personal attack. How I appear to you isn't really relevant. On a second or third reading of my posts you might find that your perception of me changes. Perceptions of people are temporal and are not what the inquiry is about. A serious problem with the sort of personal comments that a few people have been making in this blog thread is that they can lead to an interpretation that the person is "baiting" the recipient rather than having a professional discussion. What is relevant are on-topic, non-personal comments that further the inquiry of this thread.

Mark Evans - Re: The Case for First Class Messages  blueArrow
6/4/2004; 10:52:31 PM (reads: 651, responses: 0)

Frank - Starkiller claims its basis to be Ole Agesen's Cartesian Product Algorithm.

Dominic Fox - Re: The Case for First Class Messages  blueArrow
6/5/2004; 2:14:32 AM (reads: 643, responses: 1)
In dynamic systems these "type" pathways can be observed and "recovered" at run-time by "type inferencing" systems that record the "object types" in actual use

Type inference is typically a static operation, which has nothing to do with the "recovery" of type information at run-time. It's true that Starkiller, for instance, uses dataflow analysis to build a model of what will happen at run-time, and bases its type inference on that, but it is still static type inference. There is no "observation" of the behavior of running code involved.

Marcin Stefaniak - Re: The Case for First Class Messages  blueArrow
6/5/2004; 2:32:59 AM (reads: 640, responses: 0)
Peter: I'm intending to engage in a professional quality discussion regarding first class messages, and to explore "typed" and "untyped" systems to find out the compelling benefits that both offer. This is in the context of determining what features to support in the Zoku language variant of Smalltalk that I'm developing.

Well then, so you want to develop the best Smalltalk? Then you certainly must have a type-inferer (like Chuck), and use it to support decent code navigation and code assist.

Next consider using some type-check constraints, for example to show on-the-fly warnings when there is a possibility of "object does not understand method". At this point it would feel really close to Eclipse/Java. The warnings should be customizable over different parts of the source, so one could allow or deny such exceptional situations in one's own part of the system.

Then consider extending the language with (optional) type tags, so that if one wants, one can define a quite strict software contract between parts of the system, or freeze the interface of an existing part. This is a good step towards "literate programming", where the source is documentation itself; it is also helpful in the context of team programming.

As for first class messages: yes, provide easy idioms for converting method invocations to Invocation objects and invoking them later. Then again, consider using the type inferer/type tags to track the code that doesn't depend on this feature and compile it lightweight.
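Marcin's idiom of reifying a method invocation as an object that can be stored and performed later can be sketched as follows. This is a hypothetical Python illustration of the first-class-message idea the thread is about; the Invocation class and its send_to method are invented names, not Zoku or Smalltalk API.

```python
# Hypothetical sketch of a first-class message: a method invocation
# reified as an object, separated from any particular receiver.
class Invocation:
    def __init__(self, selector, *args):
        self.selector = selector   # the message name
        self.args = args           # the message arguments

    def send_to(self, receiver):
        """Deliver the reified message to any receiver that understands it."""
        return getattr(receiver, self.selector)(*self.args)

msg = Invocation('upper')          # build the message now...
result = msg.send_to('hello')      # ...deliver it later, to any receiver
# result == 'HELLO'
```

Because the message exists as a value before it has a receiver, it can be queued, logged, forwarded over a wire, or sent to several different objects, which is the essence of the "first class messages" case.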

Similarly, consider inferring/tagging code that depends on runtime type information (reflection), so that the code which does not can be compiled lightly.

And I guess you should think of a good module/package/deployment system, because the last time I used Smalltalk (it was Dolphin98) it lacked one.

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
6/5/2004; 3:57:11 AM (reads: 649, responses: 0)
Indeed. When PL folks talk about types we think about static properties. It's an unfortunate historical accident that the terms dynamic typing (and/or checking) and latent typing employ the word "typing". This causes endless debates...

A type system connects the static universe of types and the dynamic universe of run time values.

Stuart Allie - Re: The Case for First Class Messages  blueArrow
6/7/2004; 5:31:26 PM (reads: 567, responses: 0)
To quote "The Princess Bride" - "You keep using that word. I do not think it means what you think it means."

Peter wrote: "Type Erasure Going from typed variable slots to untyped variable slots the information available to the compiler at compile time changes. The type of each variable is no longer available at compile time. Actually in some cases it can be determined by tracing the program flow pathways at compile time."

This is incorrect. "Type Erasure" has a particular meaning and it is not what you describe here. Stating that the type is "not available at compile time" is exactly the opposite of what type erasure means.

From http://c2.com/cgi-bin/wiki?TypeErasure "In languages without type subsumption (procedural and functional languages with StaticTyping), all terms are ideally typed (see CompileTimeTypingProblem) and each term has a unique type - in other words, all type information is known by the compiler, and the execution of the program won't change that. In these cases, there is no reason to keep the type information around for the runtime system; so it generally isn't generated. This is type erasure. "

Peter wrote: "As a result of the information loss in both typed and untyped systems neither can be a sub set or super set of the other. A more accurate model would be two mainly disjointed sets of capabilities and features that overlap, to a some degree, with a partial common set of some capabilities."

This is also incorrect. There is no loss of information going from untyped to typed. How can there be? You are adding type information and removing nothing.

Peter wrote: In statically compiled languages the restrictions are hard coded everywhere the "type" is used which produces program rigidity. Too much rigidity and programs become brittle, less general, and difficult to evolve and manage.

This is an opinion that you have stated repeatedly with no objective support for it. To illustrate how meaningless this argument is, I could turn it around like this: "In dynamically typed languages, the lack of compile-time type safety produces program fragility. Without compile-time guarantees, the run-time behaviour is unreliable. Too much dynamicity and programs become fragile, unreliable, and difficult to evolve and manage."

Now, personally, when I don't care if a program is fragile (because it's just a quick hack for my own use), I'll use a dynamically typed language. But if I care about the long term use of the program, its reliability, and its correctness, I insist on static typing. To me, this is flexibility, as it allows me to change the program in consistent ways, knowing that the type checker will remind me of a wide range of silly errors I might make.

You've written about types "constraining" programs. The implication is that this is a bad thing. In a strong statically typed language those constraints mean you have a guarantee that you will not get a type error at run time. In a dynamically typed language you have no such guarantee. I really don't see how such a constraint can be a bad thing. I'd much prefer a compile-time error over a run-time one. The constraints are present in a dynamically typed language too - you can't send a message to an object that doesn't understand it and expect sensible behaviour. Smalltalk programs are no less "constrained" than Haskell programs.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/8/2004; 9:31:11 AM (reads: 509, responses: 1)
Dominic: Type inference is typically a static operation, which has nothing to do with the "recovery" of type information at run-time. As the definitions and usages from my prior post demonstrate, the term has accepted wider applicability than the way that you typically use it. I'm not the one making up this definition! For clarity in this thread I'll try to use "dynamic type inference" and "static type inference" to distinguish between them. So far, however, the meaning should have been clear from the surrounding context.

Ehud: Indeed. When PL folks talk about types we think about static properties.

In the Smalltalk community the usage of the word type often refers to the class of the object. So the usage is dependent upon the context and the community. The term is not exclusive to static types. This is similar to the use of "object" when referring to an "object instance". It's just easier to say it that way.

Ehud: "It's an unfortunate historical accident that the terms dynamic typing (and/or checking) and latent typing employ the word 'typing'." English is a powerfully flexible language with a long history, giving it great breadth and depth of expression and the ability to use the same and different words for many purposes.

Ehud: "A type system connects the static universe of types and the dynamic universe of run time values." Yes that's one of many ways of looking at it.

Google defines type and system.

Cheers!

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/8/2004; 9:55:37 AM (reads: 491, responses: 0)
Marcin, thank you for your excellent suggestions and comments.

Marcin asked: "Do you want to develop the best Smalltalk?"

I want to do the best that can be done. The full Zoku system is a variant of Smalltalk with a distinct syntax based on Smalltalk's. It is influenced by other computer languages including Lisp, Forth, Self, Lua, Io, Slate, FlashMX, and many others too numerous to mention. It's quite different from Smalltalk in many, hopefully, powerful ways, but almost all Smalltalkers will recognize it and will likely be able to start using it very quickly. (I've bounced it off of some and it's been a big hit so far.) Zoku also understands Smalltalk syntax. I want to create the best Zoku that I and the Zoku team possibly can! One of the reasons I'm here asking questions about the benefits of static typing is to find out if there are any that are compelling.

A dynamic run-time type inference tool has a number of nice advantages that I look forward to. I've listed them in an earlier posting in this thread. Also see the comments regarding validations following and in prior posts.

Zoku will sport a full set of objectified first-class run-time "validations", some of which impose constraints. It's possible to view programs, in part, as a network of interconnected validations. What matters to dynamism and quality is where these constraints and validations are placed, how restrictive they are, when they are active, and how they can change on the fly as needed. "Variable and expression typing" are very simple forms of validation, only two of many, most of the others being more powerful and flexible. It's about balancing flexibility and safety while reducing rigidity and brittleness, and at the same time increasing dynamism; all towards maximizing quality. Being dynamic has serious advantages. The goal is rapid and correct construction, and on-the-fly major reconfiguration, of applications and Zoku itself.

The Zoku development and run-time environments will sport the state of the art in tools for developers and users alike. Most Smalltalk implementations already have excellent code navigation tools. The future holds newer and better capabilities.

As mentioned in an earlier post, Zoku's syntax supports a shortcut inline message syntax for first class message objects. A new form of extensibility has been gained.

The Zoku compiler has many opportunities for widely known and cutting-edge optimizations. Your suggestion is a good one.

As a program generation system Zoku requires very powerful and state of the art deployment capabilities.

On the question of "type tags" (I presume you mean on variables and expressions in a way similar to functional languages?) please see my other posts.

The question of "strict software contracts between parts of a system" is a much more complex question than the typed-or-untyped-variables question, with many added complications such as interoperation between various parts of applications and systems often written in different languages. See Interoperate.org for a nice quote on the interoperation topic. The issues are wide. Zoku has features to address them in powerful and flexible ways.

Once again thank you for your insightful thoughts and ideas on how to proceed with dynamic type capabilities for Zoku.

Marc Hamann - Re: The Case for First Class Messages  blueArrow
6/8/2004; 10:40:42 AM (reads: 497, responses: 0)
English is a powerfully flexible language with a long history giving it great breadth and depth of expression and the ability to use the same and different words for many purposes.

This is quite true. However, if you announce to a crowded room that "I'm here for the good crack", the outcome will depend greatly on whether you say it in a pub in Dublin or a slummy bar in Washington DC.

(For those of you who want to Google to understand this, the more normal Irish spelling is "craic". )

Likewise, if you use "type" in a post on LtU in way that makes sense to Smalltalkers, but not to PLT enthusiasts (the target audience here), you are likely to end up in a misunderstanding. (As you may have noticed already. ;-) )

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/8/2004; 11:55:15 AM (reads: 486, responses: 0)
Please give me a break on the terminology guys. Once it's clear that there are other valid meanings (and they are clarified) just go with it!

Avi Bryant - Re: The Case for First Class Messages  blueArrow
6/8/2004; 5:43:46 PM (reads: 479, responses: 0)
In the Smalltalk community the usage of the word type...

Peter, please don't imply (or think) that you speak for the Smalltalk community as a whole. I consider myself a member of that community, and, frankly, I'd be ashamed if your comments in this thread were thought to be representative of how Smalltalkers think, write, and argue. I'm sorry if that seems harsh, but it needs to be said.

Dominic Fox - Re: The Case for First Class Messages  blueArrow
6/9/2004; 2:41:06 AM (reads: 447, responses: 1)
Please give me a break on the terminology guys.

Precise (and agreed) terminology is generally considered rather indispensable in this racket. Can we have the clarifications up front next time, please?

A quick google turned up some work being done on Dynamic Type Inference in Smalltalk. It appears to rely on comprehensive test coverage of executable code, with method wrappers being used to inspect the messages being passed. The message passing control mechanism is used to gather type information during a "test run"; once complete (and effectively "static"), this information is then passed to a type inference algorithm.

Quite a clever "lightweight" approach, but I think there may be some problems with the requirement for comprehensive test coverage - what about I/O, etc.? You could argue that assuming that the test coverage is "complete" and that the program is "operational" rather begs the question.

In any case, the authors are good enough to qualify the expression "type inference" with the adjective "dynamic", so we at least get some advance warning that they're talking about something slightly different. Note also that the purpose of the exercise is to support re-engineering of existing code, rather than enhance the expressivity of the language.
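
The wrapper-based approach Dominic describes can be sketched in Python (a hedged transposition on my part: the original work is in Smalltalk, and all names here are illustrative):

```python
import functools
from collections import defaultdict

# Record the concrete argument types observed for each wrapped method
# during a test run.  "observed_types" and "record_types" are invented
# names for this sketch, not from the cited work.
observed_types = defaultdict(set)

def record_types(method):
    @functools.wraps(method)
    def wrapper(self, *args):
        observed_types[method.__name__].update(type(a).__name__ for a in args)
        return method(self, *args)
    return wrapper

class Account:
    def __init__(self):
        self.balance = 0

    @record_types
    def deposit(self, amount):
        self.balance += amount

# The "test run": exercise the code, then inspect the gathered types.
a = Account()
a.deposit(10)
a.deposit(2.5)
print(sorted(observed_types["deposit"]))  # the observed argument types
```

As with the Smalltalk method wrappers, the types are gathered during an ordinary test run, so the quality of the inferred types depends entirely on how complete the test coverage is.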

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/9/2004; 12:50:08 PM (reads: 431, responses: 4)
(1) Avi, I have never said that I represent the Smalltalk community. I have never intended to imply it either. My writings and views are my own and are based upon experiences working in the Smalltalk community since the early 1980's.

Let me clarify the statement to read: "In my experience in the Smalltalk community the usage of the word 'type' may refer to a static type or typed variable, but it can also carry the generic English meaning. Its meaning depends on the context of its usage. An example might help: 'What's the type of that variable? Oh, it's an ordered collection. Thanks.' Here 'type' was used loosely, which is perfectly fine." Besides, I've used "type/class" in an attempt to clarify in many of the posts.

(2) I'm sorry IF you feel ashamed. Obviously that isn't my intention. Since you've not been specific, I have no way of knowing what your feelings relate to, which makes it difficult to respond.

I'd be willing to listen and respond to your concerns about how I "think, write and argue" via email (peter@smalltalk.org), on the phone, 604-736-2461, or in person. (I believe you're based in the Vancouver, BC, Canada area?)

I respect your technical views and I'm willing to hear your personal opinions via a private channel. I just don't feel that an internet-based public forum of professional caliber is the appropriate place for personal opinions of others. I extend this offer to any of you.

(3) As to the expression of negative personal opinions of others: I've not expressed negative personal opinions about the other people in this thread, out of respect and a view that such opinions do not further the discussion in a professional manner. I've asked people to refrain from such opinions as they are not relevant to the questions. It's about a higher standard where the discussion stays focused on the technical topic rather than diverging into negative personal opinions of others. Is that too much to ask?

(4) Dominic, I fully agree that precision in terminology is indispensable in technical fields. I am constantly learning new terminology and how people use it in their fields. Specialized areas have their own usages of words, and that's fair. Every group has its jargon. I am learning from you and others in this thread. I appreciate your on-topic comments.

I will attempt to clarify. Actually, I was attempting to do that, repeatedly; obviously some readers were not clear on my meaning. I will endeavor to write with more clarity and as much precision as necessary to convey the meaning I intend.

(5) As for "type erasure", I was unaware of the very specific meaning of this term as used by static compilers. I have now studied it and would agree that my statements that used it were confusing. I stand corrected. It was obviously not my intention to cause any confusion and twist around the definition. I misread the meaning of Frank's usage of the term, due to a lack of knowledge in this highly specific area. I am in the process of reviewing the entire thread to see where I need to clarify what I wrote.

In most cases where I used the term "type erasure" it would have been clearer if I'd said something like "removing the type information on the variables from the source code". Type information is being erased; it's just that it is being erased from the source code when a program is transformed from a typed to an untyped program. To distinguish what I meant from static compile-time type erasure, how does "source code type erasure" fit with people? Does anyone have another short way to say that?

(6) I am currently preparing a reply to the posts that have raised issues with the argument that typed and untyped variables are not sub-universes of each other. While these points may not have been clearly expressed, the substance is valid.

Matt Hellige - Re: The Case for First Class Messages  blueArrow
6/9/2004; 1:31:08 PM (reads: 437, responses: 3)
(6) I am currently preparing a reply to the posts that have raised issues with the argument that typed and untyped variables are not sub-universes of each other. While these points may not have been clearly expressed, the substance is valid.

No, the substance is not valid. Programs without static types (or, more precisely, programs where all values are of a single, universal type) are a subset of programs with a more sophisticated static type system.

This is a fact. It can be demonstrated or conceptualized in any number of ways. I'd suggest that you not post on this topic again until you've convinced yourself that it's true.

Any "untyped" program can be embedded rather trivially in a typed program. (And we're not even into "Turing tarpit" or "universal simulation" territory here. I don't mean only that one can write a Scheme interpreter in Haskell, although of course one could.)
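
A minimal sketch of such an embedding, transposed here to Python with type hints (the names VInt, VBool and Univ are my own illustrations, not anyone's proposal in this thread):

```python
from dataclasses import dataclass
from typing import Union

# The "universal type": every dynamic value becomes one statically
# declared value, tagged with its variant.
@dataclass
class VInt:
    value: int

@dataclass
class VBool:
    value: bool

Univ = Union[VInt, VBool]

def add(x: Univ, y: Univ) -> Univ:
    # The tag test is exactly the run-time check a dynamically typed
    # implementation performs before applying "+".
    if isinstance(x, VInt) and isinstance(y, VInt):
        return VInt(x.value + y.value)
    raise TypeError("add: arguments are not numbers")

result = add(VInt(1), VInt(41))
```

Statically, everything has the single type Univ; the untyped program's run-time type errors reappear as the raised TypeError.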

I'm not sure how to make this any clearer, but in this case, your assertions are simply wrong. This doesn't mean, necessarily, that the extra expressiveness of static typing adds any value. That's a different question.

Finally, the criticism about your writing style and grammar is accurate and to the point. I find your posts frustrating and nearly unreadable. If you want to hold everyone to a "higher professional standard," please try to improve your writing. And frankly (pun intended), I wish you'd give the ad hominem stuff a rest (note the correct spelling of "ad hominem"). I really don't want to get into a pragmatics debate, but your constant accusations are tiresome, paranoid and wide of the mark. If the content of a post indicates a flawed understanding of the subject matter, I expect that we'll all continue to say, "Your understanding of this subject is flawed." It's much easier than saying, "The understanding of the subject evidenced by this post appears to me to be flawed" or some even more awkward nonsense. The theory of language that such a distinction is based on is questionable anyway, and at this point your constant criticism of the form of the argument detracts far more from contentful discussion than any of the alleged below-the-belt jabs leveled at you.

Josh Dybnis - Damn! (was Re: The Case for First Class Messages)  blueArrow
6/9/2004; 9:32:09 PM (reads: 424, responses: 2)
I feel compelled to defend Peter. It's clear that he pissed some people off, but the way he is being flamed is just way over the top.

Peter has been very civil in his replies, and has consistently turned the other cheek when attacked personally instead of responding in kind. I'm not going to go back and quote past posts, but the content in the responses to Peter is vastly outweighed by flames. Now claiming that he is making "paranoid accusations" because he points this out too is just offensive!

There is some valid criticism about his writing style and use of terminology, but the "me too" posts are pointless flames. It's not a matter of being right or wrong. I'm honestly surprised he's still around. It wouldn't be the first time we've driven someone off this way who has intelligence and sincerity.

I don't think Peter grasps how difficult it is to respond when he doesn't use our accepted terminology. But other than having a grand unified theory of everything he really rates pretty low on the crackpot index. I believe he is writing in earnest and he seems willing to take the time to approach unfamiliar territory. For that alone he should be treated with more consideration. If it isn't worth the effort to decode his posts, a couple of lines stating something to that effect, or a pointer to an earlier discussion, is much better than a page-long flame. Especially if someone else has already posted a page-long flame.

Peter: as others may have alluded to there have been many past discussions on static vs. dynamic typing.

Ehud Lamm - Re: Damn! (was Re: The Case for First Class Messages)  blueArrow
6/10/2004; 1:04:01 AM (reads: 405, responses: 0)
I guess it's about time I jumped in. Ad hominem attacks are, of course, not what we want to have here. Nor do we want to invent new terminology when standard terminology exists.

I think we've had enough discussions on Peter's writing style, and I suggest we give the subject a rest. I think the point was made many times over, and continuing seems a bit like bullying to me.

On the other hand, type theory is a broad and important subject, with standard terminology and sound theoretical reasoning. Many here spent the time required to get into this field, and are rightly frustrated when someone dismisses valid technical arguments offhand, simply because of a lack of relevant knowledge. We try to run a professional, even semi-academic, site, and many of the required references have been mentioned here many times. Naturally we don't want to start from scratch every time typing comes up.

Can we try to narrow the discussion to specific issues which can be described using accepted terminology and techniques and laid to rest (we may uncover a new fundamental question, but I must say that it seems unlikely at the moment)? I can tell everyone is trying to be civil, but things are starting to get out of hand.

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
6/10/2004; 1:28:44 AM (reads: 387, responses: 1)
Programs without static types (or, more precisely, programs where all values are of a single, universal type) are a subset of programs with a more sophisticated static type system.

This phrasing can be a bit confusing, since a specific type system restricts the universe of valid programs. The simplest example:

(+ 1 (if #t 0 #f))

This will work in a dynamic/latent/untyped setting (or whatever you want to call it) because the nested if will return a number. However, most type systems (rightly) require an if expression to always evaluate to the same type.

So: (a) an untyped language can be easily embedded in a typed language, and (b) a type system exists in order to restrict the universe of valid programs. That's why we use type systems, after all.
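
Ehud's Scheme example has a direct Python analogue (my transposition, with None standing in for #f):

```python
# The two branches of the conditional have different types (int vs
# None), but only the taken branch matters at run time, so the
# expression evaluates without error.  A static checker such as mypy
# would typically reject the addition, since it infers the
# conditional's type as "int | None".
x = 1 + (0 if True else None)
```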

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
6/10/2004; 2:04:59 AM (reads: 388, responses: 2)
Matt: Programs without static types (or, more precisely, programs where all values are of a single, universal type) are a subset of programs with a more sophisticated static type system. This is a fact. It can be demonstrated or conceptualized in any number of ways. I'd suggest that you not post on this topic again until you've convinced yourself that it's true. Any "untyped" program can be embedded rather trivially in a typed program.

I'm sorry but that's not a convincing argument.

Frank defined the universal type like type t = String of string | Int of int | Bool of bool

In effect that's tupling the type info with the value, just as dynamically typed languages do.

To turn this around, I can similarly create typed variables in a dynamically typed language by tupling type with a slot, just as statically typed languages do.

So any typed program can rather trivially be embedded in an untyped program.
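
As a concrete (and hedged) illustration of this tupling, a "typed variable" can be simulated in a dynamic language like so; TypedSlot is an invented name, and the check necessarily happens at run time rather than at compile time:

```python
class TypedSlot:
    """A slot paired with a declared type, checked on every assignment."""

    def __init__(self, declared_type):
        self.declared_type = declared_type
        self._value = None

    def set(self, value):
        # The declared type is enforced dynamically, at assignment time.
        if not isinstance(value, self.declared_type):
            raise TypeError(f"expected {self.declared_type.__name__}")
        self._value = value

    def get(self):
        return self._value

count = TypedSlot(int)
count.set(3)          # fine: 3 is an int
# count.set("three")  # would raise TypeError at run time
```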

Now I may be just as wrong with my terminology as Peter is, so I might get flamed just as he is.

But the easiest thing to do to prove me wrong is to show a typed program which you think cannot be trivially embedded in an untyped program.

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
6/10/2004; 4:58:56 AM (reads: 379, responses: 0)
To turn this around, I can similarly create typed variables in a dynamically typed language by tupling type with a slot, just as statically typed languages do.

This won't work, of course, because you lose the static checking (dynamically typed languages don't do static checks)...

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
6/10/2004; 7:24:05 AM (reads: 369, responses: 5)
This won't work, of course, because you lose the static checking (dynamically typed languages don't do static checks)...

But at least the program runs like it should.

On the other hand, static checking cannot (by design) support programmatically introducing new names (for variables, functions, methods, maybe types).

E.g., most dynamically typed languages have XML libraries that support code like this:

doc = loadXMLFile("...")
print(doc.html.head.title)

html, head and title are names that are dynamically introduced by the loadXMLFile function.

Code like that needs considerable rewriting to convert it to a statically typed language. I would not call that a "subset" anymore.
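
For concreteness, here is a hedged Python sketch of how such dynamically introduced names can work; loadXMLFile, the Node class and the document shape are all illustrative stand-ins, not a real library:

```python
class Node:
    def __init__(self, name, children=None, text=""):
        self.name, self.text = name, text
        self._children = children or []

    def __getattr__(self, item):
        # Element names become attribute names only at run time,
        # once the document has been parsed.
        for child in self._children:
            if child.name == item:
                return child
        raise AttributeError(item)

def loadXMLFile(source):
    # Stand-in for a parser: builds <html><head><title>Hello...
    title = Node("title", text="Hello")
    return Node("doc", [Node("html", [Node("head", [title])])])

doc = loadXMLFile("...")
print(doc.html.head.title.text)  # prints "Hello"
```

Converting this to a statically typed language would mean either declaring the element types ahead of time or falling back to string-keyed lookups, which is the kind of rewriting Sjoerd refers to.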

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
6/10/2004; 7:30:42 AM (reads: 368, responses: 0)
See what I wrote earlier about typed languages vs. typed style.

Frank Atanassow - Curry vs. Church  blueArrow
6/10/2004; 7:47:21 AM (reads: 372, responses: 0)
Ehud: This phrasing can be a bit confusing, since a specific type system restricts the universe of valid programs.

and: a type system exists in order to restrict the universe of valid programs

I disagree.

There are two kinds of type systems.

In one kind (let's call it descriptive or Curry-style), one assumes a collection (universe) of untyped programs, and the types index subcollections of the universe. This is like in some versions of set theory where one presumes an `ur-universe' or `ur-elements' or `individuals', and the sets just index certain collections in this universe. In this kind of system, you can say that types restrict programs and that they denote program properties. The type systems of most conventional languages fall, I believe, in this category, as do dependent type systems. So do, I think, systems in which you have (non-trivial) type equality. The reason is that if, say, types denote sets, then two sets are equal iff they have the same member programs (untyped ones), and to talk about equality of such programs without circularity you need to assume they belong to the same universe.

In the second kind (let's call it prescriptive or Church-style), there is no a priori collection of untyped programs. Instead, each type is its own little universe and every program has a unique (or, depending on details) principal-or-minimal-or-something type. There is no notion of an untyped program. You cannot talk about a program without talking about its type, because the type comes logically before the program; it is a question of well-foundedness. In this sort of system, it makes no sense to say types restrict programs; rather, if anything, they enable them. I think Haskell and ML fall roughly into this class of type system.

Furthermore, you cannot talk about non-trivial type equality in such a system, because (again assuming types are sets), you cannot even compare programs of different types, because they do not belong to the same set. (Remember equality is a binary relation on a set.) Therefore, type systems with non-trivial type equality are never prescriptive. (To make up for this, I believe you need to treat more general notions of equivalence like type isomorphy.)

Also, in such a system, a program is not merely a term, nor is it a `term-in-context'; it is a typing derivation, and this includes all the types, environments, subterms, as well as the tree structure. Of course, we do not write the programs this way: we only write the terms. The reason we can do this is that most prescriptive type systems which are used as programming languages satisfy one or more of these properties: 1) the unique derivation property, i.e., if a term has a derivation, it has exactly one; 2) coherence, i.e., if a term has multiple derivations (so 1 does not hold) then they are assigned the same semantics; 3) type inference, i.e., given an environment and a term we can infer its type (and typically one or more of its derivations).

In summary, there are both `program-restricting' and `program-enabling' type systems.

Daniel Yokomizo - Re: The Case for First Class Messages  blueArrow
6/10/2004; 8:36:43 AM (reads: 355, responses: 0)
But the easiest thing to do to prove me wrong is to show a typed program which you think cannot be trivially embedded in an untyped program.

show . (/2) . read

Usually any polymorphic values need a static type system to be verified. Another example would be a collection hierarchy:

class Collection c a where
    cmap :: (Collection c' a') => (a -> a') -> c a -> c' a'

data Set a = ...

instance (Eq a) => Collection Set a where cmap = ...

instance Collection [] a where cmap = ...

cardinality :: Set a -> Int

employees :: [Employee]
familyName :: Employee -> String

countFamilies = cardinality . cmap familyName

Marc Hamann - Re: The Case for First Class Messages  blueArrow
6/10/2004; 8:39:24 AM (reads: 360, responses: 2)
Ehud: This won't work, of course, because you lose the static checking (dynamically typed languages don't do static checks)...

Sjoerd: But at least the program runs like it should.

I think this latter comment underlines a distinction that is often missing from these type discussions: static types are constraints on the representations of programs, not on programs themselves.

Types add semantic information about the representation (loosely speaking, the "source code") that make it possible to reason about the program in ways that are not possible without this information.

This is analogous to the difference between raw assembly and a symbolic assembler. The same register or memory address can be used to store radically different information at different times in a program. However, in a symbolic assembler (or a "higher level" language) I can alias the same low-level location with different names to make clear the different uses and meanings of the contents.

This information provides a representation of the program that makes it easier for me (or an analysis program) to reason about the intent of the program, but it is lost upon assembly and has no direct effect on the actual computations performed by the program.

Unless you store the names in the program, you can't reconstruct this information from the machine code.

Likewise there is meta-computational information about the representation of the program that is provided by types that is NOT reconstructable from the executing program (unless of course it is preserved in some way in the machine code representation).

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
6/10/2004; 8:40:10 AM (reads: 354, responses: 0)
Sjoerd: In effect that's tupling the type info with the value, just as dynamically typed languages do.

And that was precisely the objective, wasn't it?

But at least the program runs like it should.

Of course. But what is your point? I don't think I ever claimed that lack of static typing (with recursion!) limits the dynamic semantics of programs, but rather their static semantics.

On the other hand, static checking cannot support (by design) programmatically introducing new names (for variables, or functions, or methods, maybe types.)

This feature (?) is orthogonal to typing and it breaks lexical scoping.

In the untyped lambda-calculus, there is no way to introduce new names dynamically, yet it's certainly an untyped language. I have always been implicitly comparing languages that are comparable, meaning that they differ only in whether typing is static or `dynamic'. This means in particular that I assume both languages satisfy some pretty mild premises like lexical scoping.

If you want to start abandoning such premises, then we are getting into another discussion. In your quest to eliminate bondage and discipline, you can go in that direction as far as you please.

For example, why don't you criticize the fact that Javascript forces you to follow a syntax? From your point of view, this must be intolerable! After all, every string can be assigned a `sensible' dynamic semantics. Just interpret it as machine code, and trap all the errors. (You will say: machine code is not expressive enough, no HOFs, etc. I reply: Fine, interpret it as a LISP machine's machine code.) The fact that Javascript statically rejects some programs, just because they don't match its notion of what a `well-formed' program looks like should, by your lights, be pretty galling. And, according to the sort of arguments DT advocates make, Javascript is thereby not letting you write your program in `the most natural way'.

This way lies madness and hypocrisy.

Now, (completely off the cuff) it does seem to me that lexical scoping is intrinsic to the idea of static typing, so much so that if we start talking about something which resembles static typing in a language without lexical scoping, I think I might prefer not to call it `static typing' but rather something else. However, I'm not as sure as you seem to be that they are completely incompatible; after all, there are languages like OCaml and Alice which support dynamic loading.

OTOH, I admit to being leery of anything which breaks lexical scoping, which is such a desirable property, when I can get the same effect in other ways. For example, in this case, I could have the loadXMLFile procedure return a dictionary of XML elements, and define a little combinator language which lets me index into the file. Or, indeed, I could just use strings for the element names. This would give me a run-time error if the string is not a real element name but, after all, that's no worse than what would happen in an untyped language, right?
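
The dictionary-based alternative Frank describes might look like this in Python (loadXMLFile and the document shape are illustrative assumptions of mine):

```python
def loadXMLFile(source):
    # Stand-in for a parser: element names are ordinary string keys,
    # so no new identifiers are introduced and lexical scoping is
    # preserved.
    return {"html": {"head": {"title": "Hello"}}}

doc = loadXMLFile("...")
title = doc["html"]["head"]["title"]  # string lookup, no new names
```

A lookup of a nonexistent element fails at run time with a KeyError which, as Frank notes, is no worse than what an untyped language would do.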

Alternately, maybe I could extend the universal datatype with a Var case for variables and add a dynamic variable lookup environment. Then it basically turns into an interpreter, but then (again) this is basically what an implementation of an untyped language has to do, right?

Matt Hellige - Re: Damn! (was Re: The Case for First Class Messages)  blueArrow
6/10/2004; 8:48:23 AM (reads: 361, responses: 0)
I feel compelled to defend Peter. It's clear that he pissed some people off, but the way he is being flamed is just way over the top.

Of course you're right. I apologize for the undue harshness. Obviously I've just found this extremely frustrating.

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
6/10/2004; 8:52:42 AM (reads: 353, responses: 1)
Marc: static types are constraints on the representations of programs, not on programs themselves.

First, some terminology. I think you are trying to distinguish between syntax (`representation of a program') and semantics (`programs themselves'). But this is bad terminology. A `program' is always a syntactic object, a representation of (sign for) its semantics. You should say: static types are constraints on programs, not on the (dynamic) semantics of programs. (If you regard the dynamic semantics of a program simply as a numeric function, like Church, then this is obvious, again because, by definition, the semantics of a programming language, statically typed or otherwise, must include all partial recursive functions. Of course, there are deeper and more useful notions of dynamic semantics than this...)

However, as I wrote above, what you claim only holds in a Curry-style type system, where the ontology of programs logically precedes that of types, not in Church-style type systems, where it is the other way around.

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/10/2004; 9:09:50 AM (reads: 335, responses: 0)
(7) I'll have to disagree with you, Matt: the substance of my view is valid and a nice way to model what is going on. I've obviously not communicated what I'm attempting to say. Let's try again.

(8) Yes, as you say, one view is "Programs without static types (or, more precisely, programs where all values are of a single, universal type) are a subset of programs with a more sophisticated static type system." I wouldn't go so far as to call it a fact; it's an interpretation or point of view that some have found useful. Calling a static type system "more sophisticated" is a big stretch. I have not found it useful, just another unnecessary detail in a pile of too many details to deal with, many of which are significantly more important. Indeed I've noticed that the view has a few serious problems and limitations that prevent it from accurately modeling the complexities of the reality of language implementations.

(8b) As mentioned in an earlier post, I'm not talking about whole type systems, as it can be said that classes are a form of type or carry type information; I'm talking about the issues arising from typed or untyped variables. However, since the devil is in the details, these issues impact the nature of the representation of entire type systems and how they organize types. The macro must take into account the micro (details).

(8c) All "logical levels" must be accurately modeled and consistent with the other levels; otherwise you have major "distortions" when moving between logical levels (chunking up to the abstract/general or chunking down to the concrete/specific) of the representational model.

(9) I have a different point of view in which statically typed variables and dynamically untyped variables form partially overlapping sets, since some of their information, features and capabilities are mutually exclusive. This means that neither is a sub-universe of the other; they are partially overlapping universes. For example, statically typed variables admit a number of performance optimizations that seem unavailable for untyped variables. Dynamically typed variables are free of the static restrictions, and much can be done dynamically at run-time that can't be done in static systems when multiple static types are used. These features, capabilities and some of the information of each are mutually exclusive, thus the partial overlap. This demonstrates that this different point of view is valid. It is a more flexible way of looking at things and more accurately models what is really going on in actual language implementations. As the expression goes, "there is more than one way to skin a cat".

(10) As to your suggestion regarding not posting, please see point (6) from my prior post. The above paragraph is a summary of my view.

(11) My writing would be shorter and clearer, with fewer spelling and grammar mistakes, if I had more time to write. Please see point (4) of my prior post. Also, some of my usage is non-standard on purpose and, as such, I often use less common definitions of words. If in doubt, ask.

(12) Please ask specific questions about anything that you didn't understand, that you find "frustrating" or that you consider "nearly unreadable". I will do my best to clarify.

(13) I checked all the instances of "ad hominem" and they are spelled the way that you and the dictionary recommends. In which post(s) is it spelled incorrectly?

(14) I would agree that "criticism about ... writing style and grammar", if relevant, is fine. It's best when the criticism specifically highlights relevant areas that need clarification; this gives the receiver of your criticism a better chance of accepting it and doing something about it. I find pointing out someone's grammar or spelling mistakes a distraction unless it makes a difference to the person or, more relevantly, to the discussion. Oh, here comes a distraction; by the way, "grammer" is spelled "grammar". My point is that none of us is perfect and that the purpose of communicating is to convey meaning, not to have the communication be perfectly written or said. At least that's the view of someone I've taken lessons from who speaks over forty human languages fluently. I find it much easier to simply ask for a clarification of anything that doesn't make sense to me rather than criticize someone on grammar.

(14b) Last summer I was reading a technical paper that I found published on the net. I sent the author a series of emails asking for clarification about each part I didn't understand. After a number of emails back and forth he made eight clarifications to the published paper. No muss, no fuss, no criticism, no personal attacks. Simply clarity.

(15) I've only pointed out the ad hominem remarks because people keep making them, more often than not without asking any questions that might remove their concerns. As you know by now, I consider those kinds of remarks off topic and unprofessional. I feel that they need to be addressed to keep the discussion professional. I suppose that we'll have to agree to disagree on this point.

(16) Now, back to the main questions: what are the benefits of messages as first-class objects, and what are the compelling benefits of statically typed variables?

(17) (Lambda was down last night and I could not post this earlier, and now I see some additional posts that seem excellent. Please give me a chance to catch up. Thanks in advance.)

Sjoerd Visscher - Re: The Case for First Class Messages  blueArrow
6/10/2004; 10:05:51 AM (reads: 327, responses: 1)
Alternately, maybe I could extend the universal datatype with a Var case for variables and add a dynamic variable-lookup environment. Then it basically turns into an interpreter, but then (again) this is basically what an implementation of an untyped language has to do, right?

Any sufficiently complicated statically typed program contains an ad hoc informally-specified bug-ridden slow implementation of half of a dynamically typed language?
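A minimal Haskell sketch of the embedding described above, assuming the usual universal-datatype encoding (all names here are illustrative, not from the thread):

```haskell
import qualified Data.Map as Map

-- The universal datatype for values of the embedded untyped language.
data Univ = UNum Double | UStr String | UFun (Univ -> Univ)

-- Its syntax, extended with the Var case mentioned above.
data Expr
  = Lit Univ
  | Var String
  | Lam String Expr
  | App Expr Expr

type Env = Map.Map String Univ

-- The evaluator: effectively the interpreter that an implementation
-- of an untyped language has to contain anyway.
eval :: Env -> Expr -> Univ
eval _   (Lit u)   = u
eval env (Var x)   = maybe (error ("unbound variable: " ++ x)) id
                           (Map.lookup x env)
eval env (Lam x b) = UFun (\v -> eval (Map.insert x v env) b)
eval env (App f a) = case eval env f of
  UFun g -> g (eval env a)
  _      -> error "applied a non-function"  -- a dynamic type error
```

Adding the Var case and the environment is precisely the point where the embedding stops being a mere datatype and becomes a small interpreter.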

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
6/10/2004; 10:37:23 AM (reads: 322, responses: 0)
Sjoerd: Any sufficiently complicated statically typed program contains an ad hoc informally-specified bug-ridden slow implementation of half of a dynamically typed language?

What a stunning knee-jerk reaction! :) I threw out a number of ideas in that post; do you mean to discredit them all by attacking this one? But, OK, I'll bite.

Yes, what you say is accurate.

However, `any sufficiently complicated statically typed [language]' also contains a formally specified, correct, fast implementation of a full untyped language. Let's consider your choice of adjectives in turn.

ad hoc

What makes Perl or Python less ad hoc than an embedded version of them? Because the embedded version is created on the spot? Put it in a library, then. A library component is likely to be well-developed, well-tested and reused.

informally-specified

So, you are suggesting that Perl, Python and Ruby have formal specifications? :) Anyway, how is this a property of the programming language? Whether you specify it or not is up to you. And by implementing it in a statically typed language, you make it easier to develop a specification anyway. Personally, if I were to stick such an embedding in a library, I would probably choose the untyped lambda-calculus, and that already comes with a formal specification, doesn't it?

bug-ridden

Why does it necessarily contain bugs? Anyway, if we go by line count, it takes many, many fewer lines to implement when you embed it than it does to, say, write it from scratch in C, as you would have to do to avoid the problem of being...

slow

I would be willing to wager that an embedded language is as fast as, or faster than, the usual naive implementations (which do no run-time optimization) which most people use for daily tasks. In addition, it is easier to extend with new primitives from the statically typed portion of the language. (Besides, surely DT advocates don't care about speed... :) And even then, there's no reason you can't do run-time optimizations in an embedded language.

half

Why half? Because it's missing the syntax of a stand-alone language? (This could be remedied with macros.) Or because it's developed on the spot? (Again, it could be a library component.)

Lastly, I would add that, even if it were necessarily `ad hoc informally-specified bug-ridden [and] slow', why should I care? I don't intend to do any significant amount of programming in an untyped fragment of a typed language.

Marc Hamann - Re: The Case for First Class Messages  blueArrow
6/10/2004; 11:39:39 AM (reads: 321, responses: 0)
I think you are trying to distinguish between syntax (`representation of a program') and semantics (`programs themselves').

I understand the formal distinctions you are making with the Church/Curry formalisms, but I'm not getting at a syntax/semantic distinction.

In fact, I realize that I have done a "bad thing" and introduced a different level of semantics from the normal PLT notion of semantics without flagging it.

For lack of a better name, I think I can call it "intentional semantics", and what I mean by it is the semantics of "what is going on" in the program as viewed by the programmer.

So, as a simple example, there are many syntactic ways to write a function that converts from Fahrenheit to Celsius. From a formal PL semantics point of view, you simply have a function that for a given Real returns another given Real.

However, from an intentional semantic point of view, this only makes sense if the Real you feed it represents a Fahrenheit temperature and you interpret the result as a Celsius temperature value.

This can easily be encoded through the use of types, and I think it is one of the unsung utilities of static typing that I hope "dynamicists" can appreciate.
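For instance, a minimal Haskell sketch of encoding that intentional semantics with types (names are illustrative):

```haskell
-- Both types are a Real underneath; the newtypes record the
-- programmer's intent, so mixing them up fails to compile.
newtype Fahrenheit = Fahrenheit Double deriving (Eq, Show)
newtype Celsius    = Celsius Double    deriving (Eq, Show)

toCelsius :: Fahrenheit -> Celsius
toCelsius (Fahrenheit f) = Celsius ((f - 32) * 5 / 9)

-- toCelsius (Celsius 20)  -- rejected statically: wrong intent,
--                         -- even though the representation matches
```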

Isaac Gouy - Re: The Case for First Class Messages  blueArrow
6/10/2004; 11:49:12 AM (reads: 315, responses: 0)
"Any sufficiently complicated statically typed program contains an ad hoc informally-specified bug-ridden slow implementation of half of a dynamically typed language?"

Discussion groups have taught me to invert this kind of statement (or replace statically/dynamically with green/red) just to explore how much of the meaning comes from the statement, and how much extra we bring to the statement with our assumptions and biases:

"Any sufficiently complicated dynamically typed program contains an ad hoc informally-specified bug-ridden slow implementation of half of a statically typed language?"

Isaac Gouy - Re: The Case for First Class Messages  blueArrow
6/10/2004; 1:38:01 PM (reads: 296, responses: 0)
Or Concrete-Type Inference

see Chapter 4, "Concrete-Type Inference for Smalltalk"

Applications of Concrete-Type Inference, Peter von der Ahé

Chris Rathman - Re: The Case for First Class Messages  blueArrow
6/10/2004; 2:08:50 PM (reads: 291, responses: 1)
This won't work, of course, because you lose the static checking (dynamically typed languages don't do static checks)...
would you consider that code can be written in a dynamic language that implements a compiler for a static language? (e.g. an ML compiler written in Scheme).

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
6/10/2004; 2:36:04 PM (reads: 289, responses: 0)
This isn't really interesting because it gets us back to the Turing tarpit. When we discuss the issues relevant to this thread we most certainly aren't talking about computational universality (i.e., Turing completeness).

Peter William Lount - Re: The Case for First Class Messages  blueArrow
6/10/2004; 2:51:39 PM (reads: 292, responses: 7)
Type inference is a program analysis to deduce types from partially typed or untyped programs. The most widespread use of type inference is to type check statically typed programs.
Chapter 4 Introduction, "Concrete-Type Inference for Smalltalk", Applications of Concrete-Type Inference, Peter von der Ahé

Thank you for the excellent reference Isaac.

Marcin Stefaniak - Re: The Case for First Class Messages  blueArrow
6/11/2004; 1:58:36 AM (reads: 242, responses: 0)
(16) Now, back to the main questions, what are the benefits of messages as first class objects

The "messages as first class objects" concept is roughly equivalent to what Java dynamic proxies provide. These are pretty handy for implementing AOP-like features (logging, synchronization, remote calls, etc.). This is by now common knowledge (at least where I work), so there is little we need to discuss: it works in practice, and a modern general-purpose language should support it.

There is, however, more to this than that - invocation interception could be used to obtain interesting properties of the concurrency model. I don't know which programming language addresses this, but a friend told me about one a few months ago. Perhaps you know more details; I'd be delighted to hear.
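As an illustrative sketch of the first point, a message can be reified as ordinary data, so a generic interceptor (logging, here) can wrap any handler, much as a Java dynamic proxy would. All names below are invented for the example:

```haskell
-- A message reified as plain data (selector plus arguments).
data Message = Message { selector :: String, arguments :: [Int] }

type Handler = Message -> Int

-- A cross-cutting interceptor: record the selector, then delegate.
withLogging :: Handler -> Message -> (String, Int)
withLogging h msg = ("invoke: " ++ selector msg, h msg)

-- An ordinary receiver, written with no knowledge of logging.
account :: Handler
account (Message "deposit" [n]) = 100 + n
account (Message sel _)         = error ("doesNotUnderstand: " ++ sel)
```

Because the message is a first-class value, `withLogging` works uniformly over every handler; the receiver never has to cooperate.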

Dominic Fox - Re: The Case for First Class Messages  blueArrow
6/11/2004; 2:56:51 AM (reads: 239, responses: 6)

I take it it's understood that "Any sufficiently complicated x typed program contains an ad hoc informally-specified bug-ridden slow implementation of half of a y typed language?" is a jocular paraphrase of Greenspun's Tenth Rule of Programming?

Just checking...

I note that the Cartesian Product Algorithm-based concrete-type inference discussed by Peter von der Ahé involves simulated execution "where expressions evaluate to concrete types instead of values". Still not really the same thing as inspecting running code.

I admit I don't understand how these two sentences fit together:

  1. Type inference is a program analysis to deduce types from partially typed or untyped programs.
  2. The most widespread use of type inference is to type check statically typed programs.
Doesn't the second sentence contradict the first?

Also, isn't type inference precisely a program analysis to infer types? Expressions in a Haskell program have types, but they need not (all) be stated by the programmer: the type checker is generally able to infer them. That is inference: based on what I have said about X and Y, it is possible to infer what I must mean by Z (and if I then say something about Z that is inconsistent with this, the type checker will detect the inconsistency). Expressions in a Scheme program do not have types (or rather have only a few primitive types), but we may be able to deduce the types they should have if the program were to be rewritten in a language with a more sophisticated type system. That is, it is possible to deduce some things I might have said in a typed language about X, Y and Z that would be consistent with what I do say about them in an untyped language. Is this a valid distinction?

OT - I find it interesting, incidentally, that this discussion has interleaved issues of pragmatics (consistency, relevancy etc. in argument) with issues of typing (consistency, relevancy etc. in programs). Nobody has quite said that the decision about whether or not to program in a statically typed language is an issue of personal morality...

Ehud Lamm - Re: The Case for First Class Messages  blueArrow
6/11/2004; 3:15:44 AM (reads: 228, responses: 0)
Nobody has quite said that the decision about whether or not to program in a statically typed language is an issue of personal morality... Yet.

Marc Hamann - Re: The Case for First Class Messages  blueArrow
6/11/2004; 5:48:27 AM (reads: 219, responses: 1)
Type inference is a program analysis to deduce types from partially typed or untyped programs. The most widespread use of type inference is to type check statically typed programs. Doesn't the second sentence contradict the first?

I think the problem here is the wording of the first statement. Type inference deduces the implicit type of a program; in a sense the type is "already there", just not explicitly. Type inference will fail if the program doesn't "have type".

This may sound like a mystical property of invisible types, but in fact it is quite a strong constraint on programs (as anyone who has worked with a type-inferring compiler can attest ;-) )
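A small Haskell illustration of the type being "already there" (illustrative only):

```haskell
-- No annotations anywhere, yet the implicit type is "already there":
-- GHC infers compose :: (b -> c) -> (a -> b) -> a -> c.
compose f g x = f (g x)

-- By contrast, a term with no such type, e.g. \x -> x x, is
-- rejected outright: inference fails exactly when the program
-- doesn't "have type".
```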

whether or not to program in a statically typed language is an issue of personal morality...

I can hear it now: "You cad... how could you be so typeless?" ;-)

Dominic Fox - Re: The Case for First Class Messages  blueArrow
6/11/2004; 8:01:57 AM (reads: 217, responses: 0)

As I half-feared at the time, it made no sense for me to try to distinguish between "inference" and "deduction": deduction is a mode of inference, and that's all there is to it. I was half-thinking of the distinctions between deduction, induction and abduction, but a miss is as good as a mile.

Peter von der Ahé - Re: The Case for First Class Messages  blueArrow
6/11/2004; 8:54:42 PM (reads: 188, responses: 1)

I admit I don't understand how these two sentences fit together:

  1. Type inference is a program analysis to deduce types from partially typed or untyped programs.
  2. The most widespread use of type inference is to type check statically typed programs.
Doesn't the second sentence contradict the first?

No, I think you read the first sentence as "type inference is a program analysis to deduce types from programs without types". However, most programming languages have types. Actually, I do not think I have ever encountered an assembly language that did not have some notion of types, e.g., strings and integers.

Also, some people distinguish between two kinds of programming languages: statically typed and dynamically typed. The idea is that the opposite of static is dynamic, so programming languages without static type declarations (such as Lisp) are thus dynamically typed. A programming language such as Java has static type declarations; however, not all type checks can be done at compile time, e.g., array stores and down-casts.

So is Java statically or dynamically typed? I prefer to say it is both because I think that dynamic types refer to having types available at runtime, as they are in Lisp and Smalltalk.

C is an example of a (weakly) statically typed language without dynamic (runtime) types.

SML is a good example of a statically typed language that relies heavily on type inference to type check, but Java is also using a small amount of inference in the upcoming Tiger release.

Disclaimer: I work on the Java compiler at Sun, so I like to mention Java ;-)

Dominic Fox - Re: The Case for First Class Messages  blueArrow
6/12/2004; 3:44:38 AM (reads: 184, responses: 0)
No, I think you read the first sentence as "type inference is a program analysis to deduce types from programs without types".

Understandable, given that the first sentence is "type inference is a program analysis to deduce types from...untyped programs"... ;)

(Added) I think I do understand now what was meant, which is (as I understand it) that type inference deduces types from programs without (or with only partial) explicit type information. Is that about right?

Patrick Logan - Re: The Case for First Class Messages  blueArrow
6/14/2004; 9:09:37 PM (reads: 108, responses: 2)
"Any sufficiently complicated statically typed program contains an ad hoc informally-specified bug-ridden slow implementation of half of a dynamically typed language?"

"Any sufficiently complicated dynamically typed program contains an ad hoc informally-specified bug-ridden slow implementation of half of a statically typed language?"

From 24 years of programming professionally, building large systems in several "static" and "dynamic" languages, and conversing with many programmers with similar experiences, I can say I've come across significantly more evidence of the former than the latter.

Frank Atanassow - Re: The Case for First Class Messages  blueArrow
6/15/2004; 11:39:32 AM (reads: 93, responses: 1)
Patrick: From 24 years of programming professionally, building large systems in several "static" and "dynamic" languages, and conversing with many programmers with similar experiences, I can say I've come across significantly more evidence of the former than the latter.

Update: Knee-jerk reaction deleted.

Naturally. The reason you never see the latter is because dynamic typers are all convinced they are already using the One True Paradigm. Also, it is much harder to implement the latter than the former.

Strictly speaking, it is impossible, actually, but I will let that go. (You have to change your notion of "static semantics" so that "static" means "stage one" of the dynamic semantics.)

Matt Hellige - Re: The Case for First Class Messages  blueArrow
6/16/2004; 7:52:28 AM (reads: 77, responses: 0)
Strictly speaking, it is impossible, actually, but I will let that go. (You have to change your notion of "static semantics" so that "static" means "stage one" of the dynamic semantics.)

I disagree. I can think of a number of examples that do have a true static stage. Here's just one: if I'm performing code generation in advance of final program execution, then presumably a number of errors might be identified at code generation time. Despite the fact that my code generator is a "dynamic" program, and the code generated is a "dynamic" program, the process of code generation itself seems to amount to static analysis.

I suppose you could argue that this is still just "stage one" of the dynamic semantics, but as long as "stage one" can be substantively separated (in time and execution environment) from later stages, I'd say that's a reasonably useful definition of "static". Maybe this is sloppy reasoning. If so, I'd welcome correction.

I've seen a lot of this kind of thing in dynamically typed systems, and it really bears out the basic statement. The static phase of these systems is rarely as robust, as well-designed or as powerful as I'd like (IMHO, of course).

Patrick Logan - Re: The Case for First Class Messages  blueArrow
6/17/2004; 12:15:41 AM (reads: 50, responses: 0)
The reason you never see the latter is because dynamic typers are all convinced they are already using the One True Paradigm.

Wrong assumption. I would estimate 2/3 to 3/4 of the people I've worked with and associated with over the last 20+ years have been "static" proponents. Among the "static" languages, I've built, with teams, large systems in C, C++, Pascal, PL/1, Java, and Mainsail.

In every one of these systems, we or our predecessors had built significant "property list" and "dynamic dispatching" kinds of mechanisms. Several of them also included dynamic "little language" (even Scheme) interpreters.

Are there better "static" languages now that provide more dynamic features in a safer context? Absolutely. But I find it is undeniable that the quest is to be ever more dynamic.

Do I think Smalltalk and/or Lisp are the "One True Paradigm"? Absolutely not. They are the most useful tools *I* know of today. But the functional programming community is making that less so all the time. I am not ready to switch, but I understand why others prefer languages like Haskell, Clean, SML, OCaml, etc.