Lambda the Ultimate

inactiveTopic Modern Static Typing: Less Code, Better Code
started 6/16/2003; 7:59:17 AM - last post 6/19/2003; 6:27:00 PM
Bryn Keller - Modern Static Typing: Less Code, Better Code  blueArrow
6/16/2003; 7:59:17 AM (reads: 690, responses: 11)
New article on my site (www.xoltar.org):

In a recent piece called Strong Typing vs. Strong Testing, noted programmer and author Bruce Eckel makes an argument that dynamically typed languages such as Python are superior to statically typed languages such as Java and C++. I've done quite a bit of Python and Java programming, and even a little C++, so I can appreciate his position, but I think the conclusion goes too far. Whether Python is more productive than C++ or Java is one thing, whether static typing in general should be abandoned is quite another.

Patrick Logan - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/16/2003; 10:09:41 AM (reads: 710, responses: 1)
"I hope more people will take the step of learning a language with a powerful static type system, like Nice, Haskell, or Ocaml before deciding against the very idea of static typing."

I agree. Here is some perspective though. Consider that current dynamic language know-how started in the 1960s, developed in the 1970s, and left the labs in the 1980s. Use of modern type systems could be on a similar path to widespread adoption. Although since modern hardware is capable of running these type systems pretty well, that path may be shorter. Depends on the extent that path is gated more by cultural issues than technical.

Meanwhile see also...

See the Language of the Year effort at the Pragmatic Programmer's site and corresponding Wiki.

They looked at Haskell in 2002. Not sure what's going on currently. Voting indicates they're looking at Common Lisp.

scruzia - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/16/2003; 12:14:04 PM (reads: 731, responses: 0)
The LotY effort is somewhat scattered this year. Common Lisp, OCaml, Oz, some continuing with Haskell. I thought I'd combine the first two by translating some of Graham's "On Lisp" examples into OCaml + OCamlp4. Haven't gotten very far yet...

Frank Atanassow - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/17/2003; 7:11:10 AM (reads: 634, responses: 2)
A few random remarks.

you can't know your program is correct without testing it

And even by testing it, you cannot (generally) conclude it is correct.

those tests are in effect the specification for the program

The tests are only an approximation to a specification. This is evident from the fact that people will add tests when they discover their program is doing something wrong: they are using a standard above and beyond the test suite to determine program correctness.

Testing is a fundamental requirement, with or without static typing.

I agree.

Again, there's no question that Python is more concise than Java, C++, or C#.

I don't buy this.

When you compare a Python program with a Java program, say, how are you comparing them? Presumably you are doing it by erasing all the types in the Java source, verifying that the dynamic behaviors of the Python program and `erased Java' program are the same, and then looking at the original sources and comparing their lengths or complexity. By this standard, obviously the Python program will come out ahead because erased Java and Python are basically the same language, and erasing can only make a program smaller. This method of comparison is completely biased because you are throwing away part of the program without accounting for its value.

Here is another method of comparison. Instead of erasing the types of the Java source, annotate the Python source with types using some type system analagous to Java's, verify that the two programs' dynamic behaviors are the same, and now look at the originals and ask yourself which is better. In this case, since type annotation is generally not automatable (the exceptions being languages like Haskell and ML with decidable type inference), the Java program will look better, because you have to do extra work on the Python version to make them comparable.

When you compare an untyped and a typed language by erasing the types of programs of the latter, you are destroying information which cannot generally be automatically recovered. The type annotations express invariants of the dynamic behavior of the program; those invariants will continue to hold, of course, if you erase the types (in a language with type-erasure semantics), but you cannot demonstrate that they hold without them.

Put another way, a program in a typed language is a proof that some invariants hold; when you erase the types, you lose the proof, some knowledge of the program behavior. Recovering that knowledge is costly; it requires effort. Maybe you `know' that your program works correctly, but without the proof you cannot demonstrate it. Then users of your program are forced to trust you, rather than some objective standard of program behavior. Less reliance on trust, and more decentralization, becomes more and more important as software becomes more and more ubiquitous and interdependent, and as programmers and users become more and more anonymous and unaccountable.

I suspect that the sheer amount of test code that would be required to duplicate the safety of a type checker is so great that people using dynamically typed languages don't write many of these tests at all.

Yes. Indeed, the amount is usually infinite, and the set of test cases required may not even be recursively enumerable...!

People often object to static type systems saying they only guarantee trivial properties, but hardly anyone observes the flip-side: a test suite cannot even guarantee a trivial set of properties. This is one good argument for using static typing and testing as complements, rather than as alternatives.

He offers this example of how a dynamic language will allow us to write code which works with instances of any class, so long as they support the right methods

The reason Java, etc. do not allow this sort of code is not because they are statically typed, but rather because they do not support structural equivalence of types ("duck typing"). Languages like Haskell and ML support both nominal and structural equivalence (though not IMNSHO in an optimal fashion).

Artist.draw and Gunslinger.draw are clearly not even conceptually related, and our Artist has wandered unwittingly into a gunfight. I considered this problem to be fairly unimportant from a practical standpoint, because it usually doesn't happen very often. Eventually though, I realized that this problem could be solved in the context of a statically typed language - not Java or C++ of course, but a language with a real type system, and if you can solve a problem without much effort, why not do so?

Now we are walking into the domain of LSP. The question is whether the contract for the draw method specifies anything about the behavior (i.e., whether it is vaccuous or not). If it doesn't, then it's OK for our artist to pose as a gunslinger at high noon. If it does, then it may not be. Both situations can arise.

Java uses the fact that classes derive from a common ancestor or implement the same interface as a substitute for a formal contract, because the type system cannot express non-trivial propositions about programs. Personally I think that is almost OK, as long as every programmer shares the same idea about it means, but they don't. Some programmers think that it means there has to be a contract for each method, which is expressed somewhere outside the program, and programs have to respect those contracts. Others think it means that the contract is vaccuous. Either viewpoint is acceptable, but there has to be agreement.

Certainly the ideal situation would be for the contract to be expressible in the language. But I do have to observe that, although what you say about being able to check contract equivalence using static type systems is possible, it isn't possible for interesting contracts in the type systems of conventional languages like Haskell or ML.

I won't comment on how Nice addresses the Cat/Dog issue via multimethods.

The simple fact is, the number of times I require a list of arbitrary objects is exactly zero.

Yeah, me too.

Any time I make a list, it's because I want to do something with the contents, which means I'm going to be storing things that are related in some way, and if I'm going to be working with objects that are conceptually related, why not relate them explicitly with a type, to let the type checker help me write less code and fewer trivial tests?

That's not quite the way I look at it. I'm with you as far as "Any time I make a list, it's because I want to do something with contents." That implies there is a function which operates on every member of the list. The function has to have a domain, which is a type. Hence the members have to all belong to that type.

The question of relating the members by putting them all in the same type or not does not occur for me. I understand that programmers of untyped languages may consider the issue that way, but I don't. To me, the types logically come first, and then the values. There is no choice about `which type to assign to some value'. The reason is that I think about all types as abstract collections, their values all mutually disjoint. So it makes no sense to consider a value as being a member of more than one type: that would break abstraction. At best I can say, `these values of this type behave the same way as those values of that type when observed in this set of contexts'.

[unsafe operations] So if you really need to do something unusual and difficult, a static type checker won't stop you.

You don't even need unsafe operations to make that point, since you can embed any untyped language in a typed programming language by defining a universal domain. IMO, unsafe operations are there mostly to implement primitives for interoperability and so on, not to work around the type system.

Dan Shappir - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/17/2003; 1:49:15 PM (reads: 613, responses: 0)
He offers this example of how a dynamic language will allow us to write code which works with instances of any class, so long as they support the right methods

The reason Java, etc. do not allow this sort of code is not because they are statically typed, but rather because they do not support structural equivalence of types ("duck typing").

Java may (currently) not support structural equivalence of types, but the etc. - C++ - does. C++ generics are all about structural equivalence. And given that Java 1.5 specification includes generics as well, Java will also pass this test.

The simple fact is, the number of times I require a list of arbitrary objects is exactly zero.

Yeah, me too

I also agree, but with the caveat that the language supports tuples. I do need touples, and if the language lacks them, as Java does, I feel tempted to recreate them using lists (fortunately C++ is getting tuples).

Neel Krishnaswami - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/18/2003; 11:51:24 AM (reads: 544, responses: 1)
Hi Frank, I have two comments:

First, a correction:

When you compare a Python program with a Java program, say, how are you comparing them? Presumably you are doing it by erasing all the types in the Java source, verifying that the dynamic behaviors of the Python program and `erased Java' program are the same, and then looking at the original sources and comparing their lengths or complexity. By this standard, obviously the Python program will come out ahead because erased Java and Python are basically the same language, and erasing can only make a program smaller.

Python and Java are quite different languages. Python has higher-order functions, more powerful control-flow features (ie, generators), more powerful reflective capabilities, list comprehensions, and more concise syntax for common protocols such as collection handling and string manipulation. Even if you could add type annotations to Python, Python programs would still be a lot shorter.

Second, a question -- you write:

The question of relating the members by putting them all in the same type or not does not occur for me. I understand that programmers of untyped languages may consider the issue that way, but I don't. To me, the types logically come first, and then the values. There is no choice about `which type to assign to some value'. The reason is that I think about all types as abstract collections, their values all mutually disjoint. So it makes no sense to consider a value as being a member of more than one type: that would break abstraction. At best I can say, `these values of this type behave the same way as those values of that type when observed in this set of contexts'.

How do you think about existential types? In my limited understanding, you create existentials by packing a value with its type and its witness type, and you can use them for things like heterogenous lists.

Frank Atanassow - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/19/2003; 6:34:49 AM (reads: 560, responses: 0)
Python and Java are quite different languages. Python has higher-order functions, more powerful control-flow features (ie, generators), more powerful reflective capabilities, list comprehensions, and more concise syntax for common protocols such as collection handling and string manipulation. Even if you could add type annotations to Python, Python programs would still be a lot shorter.

Java's inner classes can serve as higher-order functions. Generators are a new addition to Python, so I didn't include them but, yes, they are a genuine difference. I am not sure of the differences between Python and Java's reflection capabilities, but I know that the sort of reflection they support is nothing to do with the language per se; they are just libraries which access compiler or interpreter internals. List comprehensions are just syntax. And I don't care about concise or special-cased syntax: I am talking about semantics.

OTOH, Stackless Python and Java are genuinely different.

How do you think about existential types? In my limited understanding, you create existentials by packing a value with its type and its witness type, and you can use them for things like heterogenous lists.

I don't understand the question vis-a-vis the part of my post you quoted. I think about existential types the same way as I think about all other types.

As for heterogeneous lists, lists of existentials are only heterogeneous inasmuch as a list of any sum type is heterogeneous. An existential is, after all, just a kind of infinite sum.

Toby Reyelts - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/19/2003; 10:57:47 AM (reads: 516, responses: 1)
Java may (currently) not support structural equivalence of types, but the etc. - C++ - does. C++ generics are all about structural equivalence. And given that Java 1.5 specification includes generics as well, Java will also pass this test.

Unfortunately untrue for Java Generics which are based on subtype polymorphism. Frankly, I have yet to understand what subtype-based parametric polymorphism provides beyond a little bit of compiler-generated casting.

andrew cooke - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/19/2003; 11:28:27 AM (reads: 538, responses: 0)
fwiw, you can kind-of have structural equivalence by using interfaces all over the place. if every method has its own interface then you can combine those interfaces to define any structure you want.

doesn't work if you don't have access to the source, though (and your fingers get tired from all the typing ;o)

Toby Reyelts - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/19/2003; 6:00:51 PM (reads: 504, responses: 1)
I take "structural-equivalence" to mean that no "naming" is involved. You just can't do this in Java with the support of the type system, because it's type system very explicitly relies on named conformance. If you want structural conformance, ala C++ templates, you have to resort to reflection - which, unfortunately, I've had to do many times. My code then becomes five times longer, less typesafe, and slower, but at least it's better abstracted.

When I talked to Gilad Bracha (Sun's "computational theologist") about supporting structural conformance through some kind of "concepts-interface" (which has been batted around in C++ circles several times), he said that the change was simply too radical for the Java language and the engineers that use that language.

All I know is that I suffer heavily each time I have to explicitly drop to reflection to achieve what could have been done trivially with a C++ template.

andrew cooke - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/19/2003; 6:27:00 PM (reads: 535, responses: 0)
i did say "kind-of". if you've only got a hammer you should be grateful for nails, even if they're only "kind-of" nails. :o)

what kind of code requires this (and can't be done by uniting classes via interfaces)? is it mainly third party classes? i'm curious (i've just grepped through my current code and have only used reflection/dyanmic code when acting on parsed (textual) data that comes from outside - i don't think generics would have helped there anyway).

Bryn Keller - Re: Modern Static Typing: Less Code, Better Code  blueArrow
6/23/2003; 8:44:56 AM (reads: 497, responses: 0)
Sorry for the late reply, I've been on vacation the past few days.

When you compare a Python program with a Java program, say, how are you comparing them? Presumably you are doing it by erasing all the types in the Java source, verifying that the dynamic behaviors of the Python program and `erased Java' program are the same, and then looking at the original sources and comparing their lengths or complexity. By this standard, obviously the Python program will come out ahead because erased Java and Python are basically the same language, and erasing can only make a program smaller. This method of comparison is completely biased because you are throwing away part of the program without accounting for its value.

As Neel mentioned, Python has semantic features beyond those available in Java. Aside from that, I would argue that even in the case of equivalent features, even adding type information to the Python version, the Python version will still usually be more concise. E.g., higher order functions:

someObject.callThis(lambda@int x@int: x * 2)

someObject.callThis(new IntCommand() {
  int execute(int x) {
    return x * 2;
  }  
}

Put another way, a program in a typed language is a proof that some invariants hold; when you erase the types, you lose the proof, some knowledge of the program behavior. Recovering that knowledge is costly; it requires effort. Maybe you `know' that your program works correctly, but without the proof you cannot demonstrate it. Then users of your program are forced to trust you, rather than some objective standard of program behavior. Less reliance on trust, and more decentralization, becomes more and more important as software becomes more and more ubiquitous and interdependent, and as programmers and users become more and more anonymous and unaccountable.

I agree, but this doesn't seem related to my argument that Python is more concise than Java, which I think still holds even if you include type information in the Python programs.

Actually, I don't think making the Python version include all Java's type information is quite fair, given that much of the type annotation that Java requires is not required by Nice. If Nice can do type inference for local variables, why shouldn't Java be penalized for inflicting the burden of type declaration on its users?

The reason Java, etc. do not allow this sort of code is not because they are statically typed, but rather because they do not support structural equivalence of types ("duck typing"). Languages like Haskell and ML support both nominal and structural equivalence (though not IMNSHO in an optimal fashion).

True, but FWIW "duck typing" is readily understood in the Python world, and "structural typing" is not, and "nominal typing" is even less likely to be understood. I probably should have made more of the connection between "duck typing" and the Haskell/ML world, though.

That's not quite the way I look at it. I'm with you as far as "Any time I make a list, it's because I want to do something with contents." That implies there is a function which operates on every member of the list. The function has to have a domain, which is a type. Hence the members have to all belong to that type.

The question of relating the members by putting them all in the same type or not does not occur for me. I understand that programmers of untyped languages may consider the issue that way, but I don't. To me, the types logically come first, and then the values. There is no choice about `which type to assign to some value'. The reason is that I think about all types as abstract collections, their values all mutually disjoint. So it makes no sense to consider a value as being a member of more than one type: that would break abstraction. At best I can say, `these values of this type behave the same way as those values of that type when observed in this set of contexts'.

I think this is probably the best way to think about it, but I don't think it would sound convincing to dynamic typing folks, who are accustomed to thinking in terms of either message passing or type casing. For instance you might want to make a list of either integers or 2-tuples of integers, and perform some operation on each of them. In Haskell you might express this list as one of these:

list1 :: [Either Int (Int,Int)]

data List2Item = forall x. Item x (x->foo) list2 :: [List2Item]

I think many dynamic typing people would be unhappy with these solutions because they see the extra constructors as pure overhead. So much "easier" to just stuff the list with Ints and (Int,Int)s willy-nilly, and use if-tests to choose the correct action:

for thing in list:
  if type(thing) is TupleType:
    x, y = thing
    doSomething(x,y)
  elif type(thing) is IntType:
    doSomething(x, 0)
  else:
    raise RuntimeError, "oops!"

Of course if you can add methods to builtin classes, you might use that mechanism instead of the if-tests, but I think in most OO languages this technique is rarely used unless you control the source code. Where was all this going? Oh yes, just to say that I think your approach is right on, but I think dynamic typing people (and people used to weak type systems like Java's) are used to considering things in terms of their "real" value (i.e., this list contains Ints and (Int,Int)s) and not in terms of function domains (i.e., this list contains things which can be passed to function foo), and I was really trying to speak to dynamic typing people and Java/C++ people in the article.

You don't even need unsafe operations to make that point, since you can embed any untyped language in a typed programming language by defining a universal domain. IMO, unsafe operations are there mostly to implement primitives for interoperability and so on, not to work around the type system.

That's a good point. I didn't present the argument that way because I think people understand casting as a simple escape hatch from the type system, but I think embedding an untyped language in a typed language sounds like a lot of busy-work to a dynamic typing fan, and I was trying to emphasize static typing for reducing work, not increasing it.

Thanks for your comments!