archives

Type Nomenclature

After spending some time thinking about the static vs. dynamic debate, I clarified my own thinking on the topic by defining some terms. I propose the following names and definitions for your consideration:
  • Simple Type - The set of values to which an expression may evaluate. The type of '1 + 1' is {2}.
  • Complete Type - the pair containing the set of types to which the free variables in an expression may be bound and the Simple Type of the expression. The Complete Type of 'int f(int x, int y)' is {{int, int}, int}.
  • Logical Type - The intended Complete Type of an expression. This concept is a function of a program's design. The Logical Type of 'int operator /(int, int)' is {{int, int - {0}}, int}, even though its Declared Type is identical to that of f above.
  • Static Type - The actual Complete Type of an expression as implemented in a particular programming language. The Static Type of 'operator /' is identical to the Logical Type in most PLs, because division by 0 is undefined. Note that the Static Type exists whether it is declared manifestly or explicitly defined or not.
  • Dynamic Type - The Complete Type of an expression as determined by a runtime evaluation of the expression. The Dynamic Type is parameterized by a set of argument bindings and evaluations. The Dynamic Type of '!a' is {true} for the evaluation {a = false}. Note that the Dynamic Type of an expression for a given execution of a program may be smaller than the expression's Static Type.
  • Declared Type - The Complete Type of an expression as declared in a specific PL. This corresponds to the type signature of an expression in languages with manifest typing. The declared type is usually bigger than both the Dynamic and Static type.
  • Explicit Type - A Simple Type as defined independently of an expression. Enumerations and user-defined object types are Explicit Types.

So why all the fuss? Well, I realize that PLT has perfectly good terms and definitions for types, but they aren't convenient to use when discussing the static vs. dynamic debate. Also, I think these terms might help to clarify different positions in the debate (though it's possible they only clarify things in my own mind). So, let me explain what I see as the relevance of these terms.

It should be obvious that the goal of a good type system is to enforce the Logical Type of an expression. It should be equally obvious that this goal has a significantly increasing cost with the rising complexity of the expression, all the way up to undecidability. So the compromise of an SC type system (SCTS) is basically the 80/20 rule - do the 20% of the typing work that produces 80% of the benefit. That usually means modelling subtypes with their most general supertype. That is, few to no PLs will statically check the validity of a type Even, because it would impose too great a burden on all expressions involving Even. Rather, the language will usually statically check the supertype of Even and dynamically check the subtype. What this means is that the Static Type of 'Even operator+(Even, Even)' is almost certainly '{{Int, Int}, Int}', even though the Logical and Dynamic Types of the expression are at least 8x smaller. And the Static Type is what the SCTS checks at compile-time.

The fact that the Dynamic Type (and the Logical Type) is often smaller than the Static Type may be what leads the DCTS camp to decide that SCTSs are not really an improvement. On the other hand, contracts can make the Static Type of an expression as small as the Logical Type, though at the cost of runtime checks. The SCTS camp claims that you don't need that level of precision for most tasks, and that even DC coders don't invoke that much precision for most code. By approximating the Logical Type with a suitably simple-to-check supertype, a large number of errors can be eliminated with the help of automated checking from the compiler.

Note that what some people call "latent types" I call "Static Types", because they are just as real and active whether you name them or not. It's just that in DC languages, they are anonymous and implicit. Some people argue that you can create Explicit Types in DCTSs; but those types will not be automatically checked by the compiler, which may or may not be a good thing, depending on the context.

Also, I changed my mind on the meaning of Dynamic Type. I decided that you can't change the type of an expression at runtime, because any information that could be used to make such a change should simply be regarded as an input or free variable of the expression. Therefore, changing the definition of an expression yields a new expression. I think the new definition of Dynamic Type makes more sense and is useful for talking about what actually happens vs. what could theoretically happen.

Now, when it comes to Contracts-as-Type-Enforcers, I maintain my claim that any contract can be checked statically. And that's because the Static Type of an expression is fixed by the definition of the expression itself, and cannot change at runtime. Contracts-as-Assertions are a whole different ballgame, and I have no particular comment on them. However, the cost of converting a contract-declared type into a statically-checked type may be arbitrarily high, which is why contracts are useful. On the other hand, contracts have the flaw that they don't write themselves in a DCTS, whereas they magically do in an SCTS. The fact that they are not perfectly defined in an SCTS should not detract from the fact that they are automatic. In fact, the case should rather be made that runtime contracts ought to be used to *augment* static contracts inserted by the compiler.

Of course, the objection is that sometimes static contracts *overspecify* the type of an expression, and tricks must be performed to expand the type. The reply is that Dynamic types can provide a suitably large type for such instances while still preserving most of the rigor of the static contract system. After that, it boils down to an argument between soft typing and Dynamic. My understanding is that these approaches are the dual of each other. That is, soft typing infers the Static Types in a program that are reasonably small and eliminates the checks for them, leaving the rest of the types to be dynamically checked. That is, softly typed languages *infer the static contracts* and *make explicit the dynamic ones*. Where more static checking is desired, types may be explicitly annotated for better performance. On the other hand, languages with Dynamic *spell out the static contracts* and *infer the dynamic ones* from the uses of the Dynamic type.

It seems to me that the results should be equivalent in the limit, and the only difference is the choice of starting point.

Busy, busy, busy

As you can probably tell from my lack of posts recently, I am extremely busy. This is likely to continue for some time...

The purpose of this message is to remind our editorial team to post something ;-)

I am sure each of you has at least a couple of cool links you can come up with without too much effort...

The Reasoned Schemer

Guess what I stumbled across at my local bookstore?

Previously mentioned on LtU, and now available... When the book was announced, Ehud said:

Authored by two of my favorites, Dan Friedman and Oleg, I have such high expectations, that however great the book is going to be, I am sure to be disappointed...

After working through the first five chapters (and sneaking a look at the implementation at the end), I'm pleased to announce that no one is likely to be disappointed... It's a real tour de force.

As expected, the focus this time is logic programming, in the form of a new set of primitives elegantly implemented around a backtracking monad in Scheme. Of course the format is familiar and comfortable, and of course it's charmingly illustrated by Duane Bibby.

So, get your copy today, and congratulations to the authors on a job well done!

Can one avoid monads?

From an aside in a paper on functional graph algorithms, using monads means you have given up on a functional approach. Unfortunately, the procedural approach might be ugly or confusing and we are left still pining for a functional solution. When and how could we use functional approaches and avoid monadic / procedural? What are the limits of our abilities with functional programming?