Type Nomenclature

After spending some time thinking about the static vs. dynamic debate, I clarified my own thinking on the topic by defining some terms. I propose the following names and definitions for your consideration:
  • Simple Type - The set of values to which an expression may evaluate. The type of '1 + 1' is {2}.
  • Complete Type - The pair containing the set of types to which the free variables in an expression may be bound and the Simple Type of the expression. The Complete Type of 'int f(int x, int y)' is {{int, int}, int}.
  • Logical Type - The intended Complete Type of an expression. This concept is a function of a program's design. The Logical Type of 'int operator /(int, int)' is {{int, int - {0}}, int}, even though its Declared Type is identical to that of f above.
  • Static Type - The actual Complete Type of an expression as implemented in a particular programming language. The Static Type of 'operator /' is identical to the Logical Type in most PLs, because division by 0 is undefined. Note that the Static Type exists whether or not it is manifestly declared or explicitly defined.
  • Dynamic Type - The Complete Type of an expression as determined by a runtime evaluation of the expression. The Dynamic Type is parameterized by a set of argument bindings and evaluations. The Dynamic Type of '!a' is {true} for the evaluation {a = false}. Note that the Dynamic Type of an expression for a given execution of a program may be smaller than the expression's Static Type.
  • Declared Type - The Complete Type of an expression as declared in a specific PL. This corresponds to the type signature of an expression in languages with manifest typing. The Declared Type is usually bigger than both the Dynamic and Static Types.
  • Explicit Type - A Simple Type as defined independently of an expression. Enumerations and user-defined object types are Explicit Types.
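
For concreteness, the definitions above can be sketched in executable form. This is a toy model in Python (the helper names are mine, not standard anything): the Dynamic Type is what one concrete evaluation actually produces, while the Static Type is approximated here by brute-force evaluation over the declared domains of the free variables.

```python
from itertools import product

def dynamic_type(expr, bindings):
    """Dynamic Type: the value set produced by one concrete evaluation."""
    return {expr(**bindings)}

def static_type(expr, domains):
    """Static Type: the actual codomain fixed by the expression's definition,
    approximated here by exhaustive evaluation over the given domains."""
    names = list(domains)
    results = set()
    for values in product(*(domains[n] for n in names)):
        results.add(expr(**dict(zip(names, values))))
    return results

neg = lambda a: not a
assert dynamic_type(neg, {"a": False}) == {True}           # {true} for {a = false}
assert static_type(neg, {"a": {True, False}}) == {True, False}
```

The point of the toy: the Dynamic Type for a given evaluation is a subset of the Static Type, which is in turn fixed by the expression itself, independent of any declaration.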

So why all the fuss? Well, I realize that PLT has perfectly good terms and definitions for types, but they aren't convenient to use when discussing the static vs. dynamic debate. Also, I think these terms might help to clarify different positions in the debate (though it's possible they only clarify things in my own mind). So, let me explain what I see as the relevance of these terms.

It should be obvious that the goal of a good type system is to enforce the Logical Type of an expression. It should be equally obvious that the cost of this goal increases significantly with the complexity of the expression, all the way up to undecidability. So the compromise of a statically checked type system (SCTS) is basically the 80/20 rule - do the 20% of the typing work that produces 80% of the benefit. That usually means modelling subtypes with their most general supertype. That is, few to no PLs will statically check the validity of a type Even, because it would impose too great a burden on all expressions involving Even. Rather, the language will usually statically check the supertype of Even and dynamically check the subtype. What this means is that the Static Type of 'Even operator+(Even, Even)' is almost certainly '{{Int, Int}, Int}', even though the Logical and Dynamic Types of the expression are at least 8x smaller. And the Static Type is what the SCTS checks at compile-time.
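
The Even example can be sketched in Python, purely as an illustration (Even is not a real type here, just a constraint): the declared machinery sees only int, and the Even subtype is enforced by a dynamic check.

```python
# "Statically check the supertype, dynamically check the subtype":
# the host type system only ever sees int; Even is enforced at runtime.

def even(n: int) -> int:          # declared type: int -> int
    if n % 2 != 0:                # dynamic check of the Even subtype
        raise TypeError(f"{n} is not even")
    return n

def add_even(x: int, y: int) -> int:
    """Sketch of 'Even operator+(Even, Even)' with declared type (int, int) -> int."""
    return even(even(x) + even(y))

assert add_even(2, 4) == 6
try:
    add_even(2, 3)                # subtype violation: caught only at runtime
except TypeError:
    pass
```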

The fact that the Dynamic Type (and the Logical Type) is often smaller than the Static Type may be what leads the dynamically checked type system (DCTS) camp to decide that SCTSs are not really an improvement. On the other hand, contracts can make the Static Type of an expression as small as the Logical Type, though at the cost of runtime checks. The SCTS camp claims that you don't need that level of precision for most tasks, and that even DC coders don't invoke that much precision for most code. By approximating the Logical Type with a suitably simple-to-check supertype, a large number of errors can be eliminated with the help of automated checking from the compiler.

Note that what some people call "latent types" I call "Static Types", because they are just as real and active whether you name them or not. It's just that in DC languages, they are anonymous and implicit. Some people argue that you can create Explicit Types in DCTSs; but those types will not be automatically checked by the compiler, which may or may not be a good thing, depending on the context.

Also, I changed my mind on the meaning of Dynamic Type. I decided that you can't change the type of an expression at runtime, because any information that could be used to make such a change should simply be regarded as an input or free variable of the expression. Therefore, changing the definition of an expression yields a new expression. I think the new definition of Dynamic Type makes more sense and is useful for talking about what actually happens vs. what could theoretically happen.

Now, when it comes to Contracts-as-Type-Enforcers, I maintain my claim that any contract can be checked statically. And that's because the Static Type of an expression is fixed by the definition of the expression itself, and cannot change at runtime. Contracts-as-Assertions are a whole different ballgame, and I have no particular comment on them. However, the cost of converting a contract-declared type into a statically-checked type may be arbitrarily high, which is why contracts are useful. On the other hand, contracts have the flaw that they don't write themselves in a DCTS, whereas they magically do in an SCTS. The fact that they are not perfectly defined in an SCTS should not detract from the fact that they are automatic. In fact, the case should rather be made that runtime contracts ought to be used to *augment* static contracts inserted by the compiler.
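
As a sketch of runtime contracts augmenting declared types, here is a minimal, hypothetical `requires` decorator in Python; it illustrates the idea, not any real contract library. The annotation carries the Declared Type, and the predicate narrows it toward the Logical Type.

```python
# A toy contract decorator: 'requires' is a made-up helper, shown only to
# illustrate narrowing a Declared Type with a runtime predicate.

def requires(pred, message):
    def wrap(f):
        def checked(*args):
            if not pred(*args):
                raise ValueError(message)
            return f(*args)
        return checked
    return wrap

@requires(lambda a, b: b != 0, "division by zero")
def div(a: int, b: int) -> int:   # Declared Type: {{int, int}, int}
    return a // b                  # Logical Type: {{int, int - {0}}, int}

assert div(10, 2) == 5
```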

Of course, the objection is that sometimes static contracts *overspecify* the type of an expression, and tricks must be performed to expand the type. The reply is that Dynamic types can provide a suitably large type for such instances while still preserving most of the rigor of the static contract system. After that, it boils down to an argument between soft typing and Dynamic. My understanding is that these approaches are the dual of each other. That is, soft typing infers the Static Types in a program that are reasonably small and eliminates the checks for them, leaving the rest of the types to be dynamically checked. That is, softly typed languages *infer the static contracts* and *make explicit the dynamic ones*. Where more static checking is desired, types may be explicitly annotated for better performance. On the other hand, languages with Dynamic *spell out the static contracts* and *infer the dynamic ones* from the uses of the Dynamic type.

It seems to me that the results should be equivalent in the limit, and the only difference is the choice of starting point.

Tomayto

Note that what some people call "latent types" I call "Static Types", because they are just as real and active whether you name them or not. It's just that in DC languages, they are anonymous and implicit.

Aside from the long history and body of work behind the term "static type", the reason for making a terminological distinction is that a crucial characteristic of "real" static types is that they have a rigorous, formal definition. In the latent-typed case, there are always multiple possible type schemes that can be assigned to a given program. The types are latent in the sense that it requires an agent such as a programmer or a soft type inferencer to actually define a particular static type scheme for a program. Until that assignment has been done, you can't really call them static types in the same sense as the fully-specified static types in a statically-checked language.

Now, when it comes to Contracts-as-Type-Enforcers, I maintain my claim that any contract can be checked statically. And that's because the Static Type of an expression is fixed by the definition of the expression itself, and cannot change at runtime

IOW, the claim is tautological.

Tomahto

Aside from the long history and body of work behind the term "static type", the reason for making a terminological distinction is that a crucial characteristic of "real" static types is that they have a rigorous, formal definition.
But I do make a terminological distinction. What you call a Static Type, I call an Explicit Type. And I think my terms better convey the difference, as I explain below.
In the latent-typed case, there are always multiple possible type schemes that can be assigned to a given program.
In the sense of a typed lambda calculus or some similar formalism, yes. In the sense I have defined it, each expression has a Static Type that can be deterministically computed from the operational semantics of the expression, and that type is unique.
The types are latent in the sense that it requires an agent such as a programmer or a soft type inferencer to actually define a particular static type scheme for a program. Until that assignment has been done, you can't really call them static types in the same sense as the fully-specified static types in a statically-checked language.

And the point of my nomenclature is exactly to point out that each expression has a Logical Type that is the ideal "type scheme", as you say, for that expression. And what real languages do is allow you to define Explicit Types that are generally supersets of these Logical Types. By loosening an expression's enforced type, you gain at least two benefits. One, you make it more likely that the enforced type definition (which I call the Declared Type) can be shared across many expressions in the program. And two, you make checking of the type more computationally tractable (such as not checking a bounded subtype explicitly, or only performing a runtime check). Nonetheless, it is often the case that the type defined by the Declared Type is larger than the Logical Type for the expression in question.

Something that I only alluded to in the previous post is that perhaps we should stop thinking about static typechecking vs. runtime contract enforcement and simply view all typechecking as a form of contract enforcement, where each check is either performed at compile time or deferred to runtime, and either manually specified or implicitly generated by the compiler. From this view, a statically checked type system is simply a Contract Generator that allows you to specify a simple set of expression contracts and have them inserted at all the appropriate places (and perhaps a few inappropriate places!) in a program.

Furthermore, just because Explicit Types are defined by a programmer does not by necessity make them any closer to the Logical Types of the expressions which use them. The formality does not guarantee logical fidelity. It only guarantees soundness and uniformity (or, at least, enables them, which has indisputable merit, but let's make the distinction).

What the historical nomenclature fails to address is that every well-formed expression has a definite and unique domain and codomain, and that is not necessarily equivalent to the Declared Type of the expression. This is what I consider to be the truest and most real type of the expression, because it is the type that is defined by the implementation of the expression itself. Whatever it is that the implementation does, that defines what I call the Static Type of the expression. And oftentimes, that type is smaller than the Declared Type, and sometimes larger than the Logical Type of the expression. I don't think there is a convenient way to express this distinction using the historical nomenclature. And the reason I believe this distinction is relevant is exactly because of the static vs. dynamic debate. Declared Types correspond to Explicit Types, and the fact that they are often looser than the Static Types of expressions is exactly the reason why contracts are touted. I think that being able to say that "manually specified type contracts allow one to Declare and define a Static Type that is closer to the Logical Type than what is typically possible with Explicit Types" is important in conveying the DC position.

Now, when it comes to Contracts-as-Type-Enforcers, I maintain my claim that any contract can be checked statically. And that's because the Static Type of an expression is fixed by the definition of the expression itself, and cannot change at runtime
IOW, the claim is tautological.
Sure, but I think it's worth saying anyway, because I think it helps eliminate some artificial lines that are drawn between static and dynamic checking.

Nitpick

Ordered pairs: (a,b)
Sets: {a,b}

Also, for some of these could you not just make use of existing mathematical terms like domain, codomain and range of a function?

Good points

It's a good thing this was just a blog comment and not a peer-reviewed article, huh? ;> Part of the reason I did not appeal to the mathematical definition of function is that it is more rigorously defined than the CS conception, and I didn't want there to be any confusion. An expression, for example, is under no obligation to have at most one image for each element in its domain.

TAPL

David;

Have you read Types and Programming Languages by Benjamin Pierce? It's widely considered to be the bible on the subject. It primarily focuses on formal, static type systems; but you probably should read it. The CS community already has some well-accepted jargon for some of the things you describe.

The area where things are a bit looser, of course, is in the area of languages which are referred to variously as "dynamically typed" or "dynamically checked" or "typeless". Terms like "duck typing" seem to have an "I know it when I see it" definition, rather than a formal, accepted one. This terminology is often argued over, oftentimes by parties trying to make a point. Part of the problem is that the dynamically-typed languages which are most popular (Python, Ruby) and the common ancestor of the dynamically-typed OO family (Smalltalk) are all products of industry and/or the open source community--two camps not known for formal rigor. (In defense of both communities--formal rigor is seldom necessary for what they do; just as end-user productivity is not often a primary concern for academia.)

A second source of confusion comes from the database community--in particular Darwen and Date, who try and reinvent type theory (poorly) in The Third Manifesto. They do cover many of the issues you raise; and also invent their own terminology for things which already exist. (In defense of D&D, they limit the scope of their writings to the context of relational databases; but there's a whole lot of research into the subject of type theory that they seem completely unaware of--instead, conjuring up a primitive type system from scratch for Tutorial D. While what they suggest is far better than what SQL provides, that's like observing that C has a more powerful type system than BCPL did.)

Changing the topic.

You brought up something that has interested me for a while. I'm trying to work through TAPL, and I've read The Third Manifesto. I agree with you that they seem to be reinventing type theory. But as someone who's still learning type theory, I've looked for specific criticisms of the Third Manifesto's ideas about typing that come from a formal background, as I think it would help me (and others) understand "what's going on". Since Darwen and Date are trying to be very rigorous in their approach, I think a criticism of their ideas would be more useful than a criticism of some other existing systems (like dynamic languages).

Er, anyway, just wondering if you could elaborate more, or point to something, or maybe suggest whether it would be a worthwhile topic for someone to figure out and maybe write up something (for, say, LtU)...

Issues and comments with D&D

In no particular order (and from memory--the version of TTM that I own is the first edition; not sure how much is changed in the second edition, which seems to be harder to find than the first edition. The first edition still seems to be in print--a questionable publishing practice if you ask me...)

* They use the term "OO" through the text to mean not "object-oriented", but "other orthogonal"--meaning domain issues orthogonal to the relational model. Which is fine, but their redefinition of an existing and well-established term (even if it isn't as rigorously defined as they would like) is more confusing than illuminating.

* Speaking of which, they peddle the "OO lacks math" argument a bit too much for my taste. While the comparative lack of formal footing for OO (until recently, at least) is an issue, to me it's a sign that more rigorous formal study is needed on the subject, NOT a reason to dismiss OO.

* The way they deal with accessors/projection functions is syntactically bizarre... this is a minor quibble.

* They permit subtyping, but only Liskov-subtyping (i.e. restrictions). OO-style subtyping, wherein one creates different types which have SOME invariants and behaviors in common but others which are intentionally different, seems to not be permitted in their model.

* Seemingly without knowing it, they have rediscovered the benefit of linear types. The "four out of five rule" is essentially an observation that some operations are valid (and this can be proven statically) only in the absence of aliasing. The relational model, wherein domain objects are only held in explicitly named relvars, essentially makes the domain objects linear (aliasing is handled by the containing relvar). Linear typing (and other substructural type systems) have been known for quite a while.

While Date and Darwen are to be commended for their rigor--they exclude many potentially useful things on the grounds that the correctness and merit of these things has not been demonstrated--they seem to be unaware of a great deal of literature on topics outside the area of relational databases.

Your last point exemplifies what bugged me

Your last point exemplifies what bugged me so much about reading it. It really seems like they have a disconnect between the body of knowledge about typing and their own field. Now that I know (for instance) what their "4 out of 5" rule really is, I can go and study up on that...

As an aside, I got the book because I wanted to understand databases. My interest is in building a database system that is well-founded on the right principles, but that involves both typing and the relational model.

Anyway, thanks. This is the sort of stuff I was wanting to hear.

I own it

But I only got through the first six or seven chapters. The problem is that LC is so far removed from any real programming language that it's tedious to relate the results to reality. The formal definition of types you find there strips out everything that isn't academically significant (i.e. relevant to proving progress and preservation). So I feel there is a disconnect between theory and actual language design. Sure, Haskell and ML and whatever may be designed by formal methods, but at the end of the day, the language designers realize that programmers want things that haven't been formalized yet and they throw them in anyway, without developing a complete theory about it. So rigor is good. Proofs are good. But sometimes it's useful to look at what actually exists, acknowledge that it's not theoretically pure along any dimension, and describe it as it is, rather than as it ought to be.

For instance, my understanding of the simply-typed LC is that it doesn't really distinguish between the type you ascribe to a function and the *actual* domain/codomain of the function as it's defined. That's because the types in the STLC are what I call Explicit Types, and it isn't important that they happen to be supersets of what I call the Static Types of the functions, even though the Static Types might be more important in an analysis of an actual program. The formalism of the STLC and other academic frameworks is important, but it isn't the whole story. A physicist may be able to describe, in principle, the workings of an electronic circuit on the quantum level. But that information is pretty much useless for designing a real circuit. At some point theory must interface with engineering, and theory cannot always win out, because theory necessarily throws away details in the pursuit of conceptual clarity.

I was merely trying to find a middle ground between theory and practice that shows that static and dynamic checking are not so different, despite the fact that static checking is heavily studied in academia, and dynamic checking much less so. I'm not trying to reinvent type theory. I was just making some observations about real type systems, and trying to make some connections.

Puzzled

The problem is that LC is so far removed from any real programming language that it's tedious to relate the results to reality.

Huh? This is the most bizarre critique I've seen in a while. The reason academics like lambda calculus is not merely that it's simple, but that it's simple and very much like a "real programming language". I would be most interested to hear any specific critiques of lambda calculus in the context of the topics in TAPL that make it a bad model of "real programming languages". I mean, the core of ML is basically a polymorphic call-by-value lambda calculus with Hindley-Milner type inference and side effects. Even granting that ML is mostly used by academics, I bet even you agree that it is a "real programming language".

Representation, I/O

The problem with LC is that it does not deal with the representation of values, even though this is exactly where the rubber meets the road for real PLs. LC assumes that you can simply create sets of values of any size, including infinite sets. But no real PL allows such a thing, and the true limitations of representations are not only manifest in such issues as overflow, but they are in fact exploited in such things as CRC checks and many encryption algorithms. This is exactly where approximating a physical system with a theoretical one fails. This is also where academic languages fail to gain popularity.

C++ is one of the two most popular languages partly because it is based on C, and partly because it gives equal weight to types and representations. C++ does not pretend that values have arbitrary size and that programmers should be unconcerned with things like integer overflow and signed vs. unsigned arithmetic. While some of these issues might be criticized as unnecessary bit twiddling, they are real issues for software engineers, even if they aren't pressing concerns for type theorists.

When it comes to I/O, there is really little to no theory about how that ought to be addressed in a language (short of isolating it in something like monads) because of how varied I/O support can be from platform to platform and application to application. Yet, I/O is a good deal of what many commercial applications do. Heck, the web browser that I'm using is pretty much all I/O. The only computation it performs is translating the HTML stream to a rendered web page, and the rendering process itself is all I/O. Could you remind me again what LC has to say about I/O?

If all you do is math, LC is a wonderful abstraction. But when math is the minority of what your programs do, LC looks much less relevant to reality.

Speaking of arbitrary sizes

C++ does not pretend that values have arbitrary size and that programmers should be unconcerned with things like integer overflow and signed vs. unsigned arithmetic.

Been a while, but last time I dealt with C/C++, the definition of an int wasn't exactly an established number of bits (platform dependent). If one wants consistency in terms of types, it's usually easier to point to the standardization that Java has across platforms (sometimes at the expense of speed).


As far as I/O goes, you seem to equate academic languages with Haskell. There are other academic languages (ML, O'Caml, Scheme, and a multitude of others) that have a quite different model of I/O - they can handle state as gracefully (if not more so) than any of the product-based languages. There are lots of academic languages with lots of different purposes. At best, your case is an overgeneralized strawman.


Finally, it's a mistake to say that Bell Labs was not an academically based think-tank. C and C++ were very much thought up and implemented by professors of thinkology. The difference is that they were operating within the constraints of existing software - that is, they were incremental implementations of languages.

Physical representations

Type theory texts don't deal in issues of physical representation because a) it's orthogonal to the theory, and b) it complicates matters for the student. However, rest assured that type theory is not an attempt to sweep such things under the rug.

There is much research on type systems applied to low-level programming languages, i.e. "typed assembler", and other ways for typesafe (and suitable for formal analysis) systems programming. The "sequel" to TAPL, Advanced Topics in Types and Programming Languages, contains several chapters on the subject.

In addition, even within the bounds of the STLC; it's not hard to construct types within the LC which are isomorphic to any machine type you can imagine--including fixed-precision integers modulo some power of 2. Textbooks usually don't do this because--again--it's irrelevant to the material which is being taught, and an unnecessary complication. (Of course, some industrial curmudgeons might say the same about the Peano axioms--practicing engineers aren't usually interested in constructing the integers out of sets--we like to assume that they already exist as givens, thankyouverymuch).
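
As a trivial sketch of the fixed-precision point (ordinary arithmetic in Python, not a formal LC construction): machine-style wrap-around can be modelled directly as arithmetic modulo a power of 2.

```python
# A machine-style unsigned byte modelled as a value constraint:
# all arithmetic is taken modulo 2**8, mimicking hardware wrap-around.

WIDTH = 8
MOD = 1 << WIDTH          # 256

def u8(n: int) -> int:
    return n % MOD        # wrap-around, as fixed-width hardware does

def u8_add(a: int, b: int) -> int:
    return u8(u8(a) + u8(b))

assert u8_add(250, 10) == 4    # 260 mod 256
assert u8(256) == 0
```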

And contrary to your claims about infinite sets; via recursive types one can EASILY specify types which have an infinite number of members. Many languages do provide numeric types with unbounded precision (BigInts and such)--bounded only by the machine's memory.

Regarding I/O--the lambda calculus (and type theory) have been quite useful at modeling I/O. The process calculi often use results from type theory; and many I/O mechanisms can be modelled using streams. More sophisticated I/O devices can also be simulated within the lambda calculus. Oftentimes, the type theory for I/O devices is trivial (a file on Unix is an array of bytes); the type-complicated part is the mapping of complex objects into bytestreams (serialization and deserialization).
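
The stream idea can be sketched in a few lines; this is an illustrative toy in Python, with the input device simulated by an in-memory buffer rather than a real file.

```python
import io

def chars(source):
    """Model an input device as a lazy stream of characters."""
    for line in source:
        yield from line

# A file on Unix is (type-theoretically) just a sequence of bytes/chars;
# here the "file" is simulated with an in-memory buffer.
stream = chars(io.StringIO("hi\n"))
assert next(stream) == "h"
```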

Moof

LC assumes that you can simply create sets of values of any size, including infinite sets.

I have no idea what you are referring to here. When I look at the definition of lambda calculus syntax and the basic concepts such as alpha conversion and beta reduction, I nowhere see where I am required to "create sets of values of any size, including infinite sets". I don't even know what exactly you mean by "create". The set of lambda terms is infinite, sure, but I don't see how that could possibly be a problem? There's an infinite number of well-formed C++ programs too.

I'll try to respond to the rest of your points once you've cleared up the above.

Wash, rinse, repeat

The problem with LC is that it does not deal with the representation of values, even though this is exactly where the rubber meets the road for real PLs. LC assumes that you can simply create sets of values of any size, including infinite sets. But no real PL allows such a thing, and the true limitations of representations are not only manifest in such issues as overflow, but they are in fact exploited in such things as CRC checks and many encryption algorithms. This is exactly where approximating a physical system with a theoretical one fails.

Take a look at Appel's book, Compiling with Continuations, which (among other things) goes into detail about bit-level issues of mapping an LC-based language onto a native code compiler. The book describes the implementation of an SML compiler, based on real-world experience with the SML/NJ compiler.

The fact that LC itself doesn't deal with representation of values doesn't prevent languages based on it from doing so. For example, SML has types such as Int32, Word32, Word8. There's no problem with including such types in a typed lambda calculus and giving them the desired representation.

Another important representation issue is how to represent programs in native code. But LC has a very strong story in that area too. The "compiling with continuations" approach which Appel describes uses a transformation in which source programs are transformed to a core lambda language which corresponds quite closely to native code, after which the biggest job is basically mapping variables bound by lambda, to machine registers. Kelsey's paper Realistic Compilation by Program Transformation summarizes this approach (it ought to be renamed "Lambda, the Ultimate Machine Code", to underscore the point).
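
The continuation-passing idea can be illustrated with a toy example (this is not Appel's actual transformation, just the flavor of it): every function takes an explicit continuation instead of returning, which makes control flow, and hence the mapping to machine code, explicit.

```python
# Toy continuation-passing style: each function receives the "rest of
# the computation" as an explicit argument k, instead of returning.

def add_cps(a, b, k):
    return k(a + b)

def square_cps(x, k):
    return k(x * x)

# (2 + 3)^2, with the identity continuation at the end of the chain
result = add_cps(2, 3, lambda s: square_cps(s, lambda r: r))
assert result == 25
```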

Basically, these are all language implementation issues, which don't have much relevance to the theory of types, or to LC as a formalism. Look in the appropriate places, and you'll find that such issues are addressed, very well.

Could you remind me again what LC has to say about I/O?

If all you do is math, LC is a wonderful abstraction. But when math is the minority of what your programs do, LC looks much less relevant to reality.

All it takes to address the issues you raise, without requiring the rigor introduced by monads, is to add support for side effects. That's exactly what languages like SML, OCaml, and Scheme do. Also note that Erlang, which is LC-based and pretty "real", avoids conventional side effects within the language, but if it didn't support I/O, a lot of telecom systems wouldn't work.

Oversimplified web browser.

Yet, I/O is a good deal of what many commercial applications do.

Agreed. Not only commercial, but many practically useful ones as well.

Heck, the web browser that I'm using is pretty much all I/O. The only computation it performs is translating the HTML stream to a rendered web page, and the rendering process itself is all I/O.

I think you are being a bit unfair to your web browser here. A compiler also normally takes input from a file and outputs its result to a file, yet I would not describe the compiler as "pretty much I/O".

"Explicit" types vs "Static" types

Sometimes, the "explicit" type (in your nomenclature) is as good as can be achieved. For example, no type system can prove, in the general case, the absence of division by zero (in programs containing divisions by non-literal expressions). Such is undecidable.

In addition, suppose you have the following code snippet in pseudocode.

a = b + c;
d = 5 / a;

To prove that division by zero does not occur, you'd have to show that a is of type [real_number] - {0}. But that type is not closed under addition--what types must b and c be? Many properties of terms are difficult to encode as types; in this case the "types" of b and c would have to be dependent on each other. Ugh.
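
By contrast, the dependent constraint that is so awkward to encode as types is a one-line dynamic check. A sketch of the runtime option (the function name is hypothetical):

```python
# The constraint "b + c != 0" cannot be expressed as independent types
# for b and c, but dynamically it is a single check at the use site.

def safe_div(n, b, c):
    a = b + c
    if a == 0:                    # the dependent constraint, checked at runtime
        raise ZeroDivisionError("b + c must be nonzero")
    return n / a

assert safe_div(5, 1, 2) == 5 / 3
```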

At any rate, these issues are well known to type theorists.

One other book I offer for your consideration: ''A Theory of Objects'', by Abadi and Cardelli. This book looks at type theory from an OO perspective, and is an interesting companion to TAPL.

Edit: Use of meta-syntactic variables in angle brackets confused the site software (which thought it was HTML/XML/etc), screwing up the formatting.

Yes, of course

For example, no type system can prove, in the general case, the absence of division by zero (in programs containing divisions by non-literal expressions). Such is undecidable.

I meant to imply as much. But I disagree with this:

Sometimes, the "explicit" type (in your nomenclature) is as good as can be achieved.

I would agree with it if you added the caveat: "With static checking." But the point is that with dynamic checking, you can make the Declared Type arbitrarily close to the Static Type.

At any rate, these issues are well known to type theorists.

And yet, they continue to focus on static checking, instead of considering the benefits of a more balanced approach.

One other book I offer for your consideration: ''A Theory of Objects'', by Abadi and Cardelli. This book looks at type theory from an OO perspective, and is an interesting companion to TAPL.

I'll take a look at it, thanks.

Dynamic checking vs dynamic typing

Except that "dynamically typed" languages DON'T make type judgements, at least not in the formal sense that statically typed languages do.

If I write (in Smalltalk)

someObject send: someMessage

The Smalltalk implementation doesn't check to see what class someObject belongs to. All it does is verify that whatever class it is, it understands the message send:. That's it. (C++ templates work in a similar fashion, though all evaluation occurs at compile time.) At no point in the program's execution does Smalltalk attempt to assign a "type" to a given term, i.e. "someObject is of class X", or "someObject must implement the following set of messages".
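The same behavior is easy to see in Python (a sketch with made-up class names): the runtime never asks "what class is this?"; it only looks up the method at the call site, and fails there if the lookup fails.

```python
class Duck:
    def speak(self):
        return "quack"

class Robot:
    def speak(self):
        return "beep"

def send_speak(obj):
    # No class check here -- just a method lookup, failing with
    # AttributeError (akin to doesNotUnderstand:) if 'speak' is absent.
    return obj.speak()

print(send_speak(Duck()))   # quack
print(send_speak(Robot()))  # beep
```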

Latent typing is very powerful and flexible in many ways; but it doesn't really qualify terms by types at all. PROGRAMMERS may do that in their minds, but the interpreter/VM generally doesn't. (Perhaps some clever optimizing Smalltalk implementations are more sophisticated; but I'm unaware of the current state of the art).

The reason that type theorists focus on static typechecking, is that latent typing is (from a theoretical point of view) largely a solved problem. There are implementation issues to study (how to do dispatch real fast is still a major concern), but verifying that object X has method Y is not a theoretically interesting problem.

You are correct, of course, in that those who take extremist positions in the static/dynamic debate are doing the discipline a disservice. But many type theorists aren't opposed to dynamic typing--it's just not an interesting (from a type-theory point of view) problem.

Only solved in wetware

The reason that type theorists focus on static typechecking, is that latent typing is (from a theoretical point of view) largely a solved problem.

It's only solved if you accept the status quo of programmers performing type inference and checking "in their minds", but it's not solved if you'd like to see more of what the programmer does in their mind to be understood by the machine, without requiring an invasive global transformation of the code.

And just preemptively, I'll point out that the solution often suggested from the statically typed side is to use a Dynamic/Universal type, but that merely allows latent types to continue to be latent in a statically typed program, and doesn't address any of the more interesting issues.
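A rough sketch of what that Dynamic/Universal type amounts to (class and function names are hypothetical): every value carries a runtime tag, and each use site dispatches on the tag. The static checker only ever sees the one type, so the latent types stay latent.

```python
class Dyn:
    """A universal type: a value paired with a runtime tag."""
    def __init__(self, tag, value):
        self.tag, self.value = tag, value

def add(x, y):
    # All the "typing" happens here, at runtime, by inspecting tags.
    if x.tag == "int" and y.tag == "int":
        return Dyn("int", x.value + y.value)
    raise TypeError(f"add: expected (int, int), got ({x.tag}, {y.tag})")

print(add(Dyn("int", 1), Dyn("int", 2)).value)  # 3
```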

That's not intended as a critique of traditional PL type theory, which is understandably largely concerned with supporting and exploiting the ability to use mechanical static proofs. Dealing with latent types falls more into the usability/expressiveness side of things, which might explain why it gets less respect. :)

Relating to reality

But I only got through the first six or seven chapters. The problem is that LC is so far removed from any real programming language that it's tedious to relate the results to reality.

Note that chapter 18 deals with imperative OO, and chapter 19 with "Featherweight Java".

There's very little on dynamically-checked systems, but still, as a pragmatic issue, I think you'd be better off not trying to redefine well-established terms like "static type".

Lazy

Note that chapter 18 deals with imperative OO, and chapter 19 with "Featherweight Java".

I prefer to use a lazy approach to evaluating TAPL. ;> Seriously, though, it's the 10 boring chapters from where I fell asleep to the more interesting parts that made me put the book down. You really have to do the exercises to make progress, and I just don't know if I have that kind of motivation without some arbitrary authority figure threatening my well-being. Also, I'd like to point out that Featherweight Java has very little relation to "Java Compleat" (or should I say "Heavyweight Java"?). And one chapter on mutable state is hardly enough to make me stand up and take notice, especially when probably 90% of the world's programmers use a language that emphasizes mutable state.

There's very little on dynamically-checked systems, but still, as a pragmatic issue, I think you'd be better off not trying to redefine well-established terms like "static type".

I wouldn't, if I were writing for an academic journal. If you like, I will change my term to "Real Type". Would that be an improvement?

And one chapter on mutable st

And one chapter on mutable state is hardly enough to make me stand up and take notice, especially when probably 90% of the world's programmers use a language that emphasizes mutable state.

Perhaps it didn't occur to you that the treatment is so short because that's all that's needed at that point? Sure, you can look more into encoding invariants into the type system, but then that's beyond TaPL's scope - ATTaPL's chapters on region systems, uniqueness typing and dependent types all contain useful mechanisms for such things. Simple state is covered with the mention of references early on, and again when new features require a slightly more sophisticated type. That covers pretty much everything languages like C++, Java et al can type about mutability (yeah, yeah, I know about volatile et al).

TaPL reading tips: just skip the proofs

I found many of the proofs too difficult to keep me interested so I started skipping them. Though I was initially hesitant, it soon became a reflex. There's still a lot of interesting stuff in almost every chapter that doesn't depend on knowing the proofs.

Seriously, though, it's the

Seriously, though, it's the 10 boring chapters from where I fell asleep to the more interesting parts that made me put the book down.

But then be more careful with drawing conclusions and inventing your own "theory" of things before you properly understand the established one and its underpinnings. If you look at the definition of type system on TAPL page 1, you will note that most things on your list do not qualify as types in that sense.

In particular, types are not just sets of values. First, the set view does not explain the more interesting features of type systems, namely abstract types and the like. And second, non-trivial type systems - particularly polymorphic ones - simply do not have an interpretation in standard set theories, because they become self-referential.

That is, types are formulas in a logic, not sets.