What is a Type?

After going through both of the Types and Programming Languages books, I am starting to feel confident in my understanding of type systems in terms of how they would be implemented. However, I still feel uncomfortable with the theoretical foundations; even after going through the proofs, defining what a type is still seems like a lot of hand-waving. So, to try to find a good mental model of what a type is, I thought I would ask LtU: what is a type?

The reason for my uneasiness probably stems from my background programming under both static and dynamic type systems. The static definition of a type seems to tend more towards what can be proven at compile time, and puts more emphasis on creating type systems which are strongly normalizing. In dynamic type systems, however, types are often manipulated as terms. In trying to find a consistent mental model for these two usages of the term (especially after the introduction of subtyping), I am currently stuck with defining types as predicates which map terms to truth values. The intuition is that a type is a proposition, nothing more. I can't seem to find much theory for such a model, and that prompted this post. Is this highly inconsistent with current theory?

More formally, what would your initial reaction be to a system where:

λt:T.x  →  λt. if T(t) then x else TypeError
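
For concreteness, here is a minimal sketch of that idea in Haskell, treating a "type" as nothing more than a predicate over a universal value domain and erasing the annotation into a runtime check. All the names here (Value, typedLam, and so on) are made up for illustration:

data Value = VInt Int | VBool Bool deriving Show

-- Under this reading, a type is just a predicate on values.
type Type = Value -> Bool

vInt, vBool :: Type
vInt (VInt _)   = True
vInt _          = False
vBool (VBool _) = True
vBool _         = False

-- The annotated lambda (λt:T.x) becomes a function that tests T(t)
-- at application time, exactly as in the rewrite above.
typedLam :: Type -> (Value -> Value) -> Value -> Either String Value
typedLam ty body v
  | ty v      = Right (body v)
  | otherwise = Left "TypeError"

For example, typedLam vInt (\(VInt n) -> VInt (n + 1)) applied to VInt 41 yields Right (VInt 42), while applying it to VBool True yields Left "TypeError".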

It Depends

The reason answers to 'What is a Type' are 'hand-wavy' is that it depends on the 'type system'.

Soundness of a type-system is well understood to mean 'progress' and 'preservation'. Progress means that any well-typed program in the type-system must always have a valid next step until reaching a final form (i.e. no undefined behavior, no getting stuck). Preservation means that predicted types are preserved across compute steps (i.e. no bad guesses).
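
As a concrete (if toy) illustration, here is a sketch in Haskell of a tiny language for which both properties can be checked by hand; the language and all names are invented for this example:

data Ty   = TyInt | TyBool deriving (Eq, Show)
data Term = Lit Int | BLit Bool | Add Term Term | If Term Term Term
  deriving Show

-- The static prediction.
typeOf :: Term -> Maybe Ty
typeOf (Lit _)    = Just TyInt
typeOf (BLit _)   = Just TyBool
typeOf (Add a b)  = do { TyInt <- typeOf a; TyInt <- typeOf b; Just TyInt }
typeOf (If c t e) = do
  TyBool <- typeOf c
  ty  <- typeOf t
  ty' <- typeOf e
  if ty == ty' then Just ty else Nothing

isValue :: Term -> Bool
isValue (Lit _)  = True
isValue (BLit _) = True
isValue _        = False

-- One small step of evaluation; Nothing means "stuck".
step :: Term -> Maybe Term
step (Add (Lit a) (Lit b))       = Just (Lit (a + b))
step (Add a b) | not (isValue a) = (\a' -> Add a' b) <$> step a
               | otherwise       = Add a <$> step b
step (If (BLit c) t e)           = Just (if c then t else e)
step (If c t e)                  = (\c' -> If c' t e) <$> step c
step _                           = Nothing

Progress here says: if typeOf t succeeds and t is not a value, then step t succeeds. Preservation says: if typeOf t == Just ty and step t == Just t', then typeOf t' == Just ty. An ill-typed term like Add (BLit True) (Lit 1) gets stuck, but typeOf rejects it up front.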

So long as you don't violate soundness in your type system, you can make anything you want be a 'type'. Not all potential types are useful, of course. But there is a very wide variety of known useful types: data types, phantom types (used many ways, though my favorite is for zero-knowledge proofs of security protocols), constraint types, dependent types, information-bearing types (as utilized for type classes and meta-programming), substructural typing (most commonly linear typing and region inference, aimed at memory management), effect typing, etc.
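
To pick just one item from that list, here is a hedged sketch of phantom types in Haskell (a toy sanitization example of my own devising, not from any standard library):

-- The parameter has no runtime representation; it only tags queries.
newtype Query status = Query String
data Raw
data Sanitized

sanitize :: Query Raw -> Query Sanitized
sanitize (Query s) = Query (filter (/= ';') s)  -- toy "sanitizer"

runQuery :: Query Sanitized -> IO ()
runQuery (Query s) = putStrLn ("running: " ++ s)

The type checker now rejects runQuery applied to a raw query, even though Raw and Sanitized have no values at all.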

So there is good reason people who have wide experience in the field often choose not to pin down 'What is a Type' except in rather generic terms that seem hand-wavy to others. That reason: "It depends."

But 'types as propositions', as you suggest, is actually well known. I'm blanking on the name - the Curry-Howard isomorphism, IIRC.

The use of first-class objects as types, as in Ruby and Smalltalk, is more part of the program expression itself than of any meaningful type-system.

Static/dynamic

The use of first-class objects as types, as in Ruby and Smalltalk, is more part of the program expression itself than of any meaningful type-system.

I think this should be emphasized. The way the term "type" is used in the static and dynamic worlds is just different. I doubt you'll find any definition that encompasses both of those usages without becoming absurdly general (like, "a classification system"), so I suggest simply looking for two distinct definitions.

Very roughly, static type systems tend to classify terms, while dynamic types (note that I do not say type "systems") tend to classify values. In this sense they have something in common, but really I think it's better to regard them as completely different.
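
One way to see the contrast: a static type classifies a term before the program runs, while a dynamic "type" is, roughly, a tag carried by a value at run time. A sketch of the latter in Haskell (names invented for illustration):

-- Dynamic tags modeled explicitly: each value carries its own tag,
-- which can only be inspected at run time, one value at a time.
data Tagged = TagInt Int | TagStr String

tagOf :: Tagged -> String
tagOf (TagInt _) = "Int"
tagOf (TagStr _) = "String"

Meanwhile the static judgement that, say, (\x -> x + 1) is a function on numbers is made about the expression itself, with no value in hand.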

Type theory means static type theory. There is really no useful theory of what dynamic languages call "types," although there are design cultures for specific languages, e.g., the Smalltalk program design tradition.

I agree on the emphasis

I think this should be emphasized. The way the term "type" is used in the static and dynamic worlds is just different.

I agree on the emphasis, and add that types also mean something different in modeling languages. In particular, UML classes are often referred to as types, but we're talking about problem domain analysis and identifying real world entities; there is no math involved in doing so, other than relational modeling to ensure these entities pass the Indivisibility Argument in modeling the real world.

Short answer: beware of, and be aware of, the context in which a vocabulary word is used.

What is a value?

A different approach to the problem would be to ask yourself: what is a value?

If you think about it a bit, you would have to answer: it depends on the formal definition of the language you are working with.

Normally people don't ask themselves this question: they assume that they already understand what a value is and don't think more about it.

When you sit down and really think about ANY abstract concept, it starts to get slippery, and that's OK: that's the beginning of really coming to terms with it. ;-)

Types = subsets

Given a domain D, the functions from D to the truth values are isomorphic to the subsets of D. So we can just as well say that types are subsets of values. For tractability, we can require that each type has a denumerable number of values, which assures that there are only a denumerable number of types. (In practice, of course, only finitely many values are representable, which means there are only a finite number of types.)
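
The isomorphism mentioned here is just the characteristic-function construction; over a finite domain it can even be written down directly (a throwaway sketch in Haskell):

-- A "type" qua predicate and qua subset are interconvertible.
domain :: [Int]
domain = [0 .. 9]

toSubset :: (Int -> Bool) -> [Int]
toSubset p = filter p domain

toPredicate :: [Int] -> (Int -> Bool)
toPredicate s = (`elem` s)

-- e.g. toSubset even == [0,2,4,6,8], and toPredicate [0,2,4,6,8] 4 == True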

Infinities are tricky.

... each type has a denumerable number of values, which assures that there are only a denumerable number of types.

If a set is denumerable (size aleph-null), the set of its subsets is not: it is strictly bigger (size two-to-the-aleph-null, which is uncountable). It only makes a difference in theory, since our computers all have finite memories. (Which is not to say that "in theory" is unimportant. It's just different from "in practice" in at least this one way.) (Posts correcting other posts always introduce new errors, and I'm sure this one does, too.)
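
The relevant fact is Cantor's theorem, which in LaTeX-speak reads:

\[
  |\mathcal{P}(S)| = 2^{|S|} > |S| \quad\text{for every set } S,
  \qquad\text{so}\quad
  |\mathcal{P}(\mathbb{N})| = 2^{\aleph_0} > \aleph_0 .
\]

(Equating $2^{\aleph_0}$ with $\aleph_1$, as one might be tempted to, is exactly the continuum hypothesis, which is independent of ZFC.)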

Types ?= subsets

In practice, of course, only finitely many values are representable, which means there are only a finite number of types.

I'm puzzled by the statement that only finitely many values are representable. Could you give a definition of this concept? In my naïve understanding, the following Haskell code:

data Nat = Zero | Succ Nat deriving Show

describes the classical type with countably infinitely many elements; and, while, on the one hand, we can certainly only write out explicitly a finite number of its elements in any given program, on the other hand, there is no element that can't be so described*. To take it further, there's a single program with the property that, for any given element* of Nat, it will 'represent' it in finite time:

let nats = Zero : map Succ nats in show nats

It thus seems to me that all elements of the type are representable. Of course, though, you know all this, which makes me think that I must not have the right definition of ‘representable’.

When I first tried to learn about a rigorous foundation for the theory of types (which is not too far from when I last tried to learn about it, so take this with a grain of salt), I was immediately drawn, as a mathematician, to the idea of types as sets—but then I saw that types are not sets. Of course, the abstract immediately says “[t]he title is not a statement of fact”, and moves on to less provocative waters, but it has left me puzzled.

For example, a term in most type systems has only one type (or, at best, a polymorphic type like Num a => a); but, in real life, an element of an infinite set will belong to infinitely many of its subsets (so we would need, in pseudo-Haskell, some sort of type constraint like term :: type `Contains` term => type, which seems so vague as to be useless). Again, this is all just coming from my attempts to generalise my mathematical understanding way too far into computer science. Could you clarify the meaning of the equation in your title?

* I'm intentionally ignoring the (non-)distinction between data and co-data that leads to constructions like

inf = Succ inf

because I don't understand it well enough to feel confident of not saying something foolish.

Types are second order

For example, a term in most type systems has only one type ... but, in real life, an element of an infinite set will belong to infinitely many of its subsets

The root of the difficulties with the "type = set" definition that others are pointing out is that a set is a first-order entity in the universe of sets, whereas a type is a second-order entity defined over some other universe of terms or values, and is intended to partition those terms or values in some restricting way.

This pretty much precludes it being closed under all the same operations that sets are closed under.

PS I got a smile from your implied definition of "real life". ;-)

'Subsets' confuses too many

The 'types = subsets' notion too easily confuses readers, who take that and tweak it ever so slightly to 'subtypes = subsets', which fails to capture covariance vs. contravariance and the resulting multi-dimensional data-type lattices.
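
One way to make the variance point concrete: model "S is a subtype of T" as an explicit coercion function S -> T, and ask what coercion exists between function types. A sketch in Haskell (nothing here is from a real library):

-- If s2t witnesses S <: T, and u2v witnesses U <: V, then
-- (T -> U) <: (S -> V): function types are CONTRAvariant in the
-- argument and covariant in the result.
coerceFun :: (s -> t) -> (u -> v) -> (t -> u) -> (s -> v)
coerceFun s2t u2v f = u2v . f . s2t

The argument coercion runs "backwards" through the composition, which is precisely the flip that the naive subsets intuition misses.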

The 'subsets' notion also doesn't generalize well to types on the structure of the program or the relationships between values, such as dependent typing, constraint typing, linear typing, uniqueness types.

I've abandoned use of 'subsets' in attempting to explain types for these reasons.

Contracts

More formally, what would your initial reaction be to a system where:

λt:T.x → λt. if T(t) then x else TypeError

Well, if T is an arrow type T1->T2, then what does T(t) mean without static typing of the function body?

The answer is probably something along the lines of higher-order contracts. It seems to me that contracts come closest to being a proper theory of what is often called "dynamic typing".
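
To give the flavor of what such a theory looks like operationally, here is a hedged sketch of higher-order contracts in Haskell, loosely after Findler and Felleisen; the names and blame strings are invented, not from any real contract library:

type Flat a = a -> Bool

assertFlat :: String -> Flat a -> a -> a
assertFlat blame p x
  | p x       = x
  | otherwise = error ("contract violated: blame " ++ blame)

-- A function contract defers checking to each call: bad arguments
-- blame the caller, bad results blame the function itself.
(-|>) :: Flat a -> Flat b -> (a -> b) -> (a -> b)
(pre -|> post) f = \x -> assertFlat "the function" post (f (assertFlat "the caller" pre x))

For example, with safeSqrt = ((>= 0) -|> (>= 0)) sqrt, the call safeSqrt 4.0 yields 2.0, while safeSqrt (-1) blames the caller at the moment of the call rather than producing a NaN downstream. Note how this answers the question above: T(t) for an arrow type cannot be checked immediately, only monitored lazily.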

Thanks

That paper was pretty much exactly what I was looking for. It is nice to see that typing is being explored through dynamic semantics as well.

Semantic and syntactic notions of type

Section 3 of the lecture notes at

http://okmij.org/ftp/Computation/FLOLAC/lecture.pdf

is entitled 'What is a type?' and tries to answer the question. The section gives four definitions, ranging from the semantic (a type is a set of values) to the syntactic (a type system as a syntactic discipline). It seems the original notion of types, due to Bertrand Russell, is the most unifying: a type is defined to be a subset of values (semantically, so to speak), but it should be determined just from the form of an expression (that is, syntactically).

BTW, Russell's definition is 101 years old.

Blast From the Past

Perhaps it's worth pointing to this and this very old thread on the subject. In particular, I still feel, intuitively, that Ontic has important things to teach us in this regard.