Type checking and logical errors

A recent correspondent asks jokingly "I suppose static typing allows for solving the halting problem as well?". This reminded me of an article where Mark-Jason Dominus (the Perl guy) describes how the type checker in ML found an infinite loop bug in a simple sort routine - - well, "found" in the sense of "gave enough information for it to be apparent to anyone paying attention that something was wrong".

I wonder then whether this is a one-off, or whether people who use ML and similar languages frequently find that the type checker points them towards bugs that even a dynamic-typing-language programmer would regard as genuine - - that is to say, errors, like infinite loops, that are not on the face of it just type violations? Does anyone have other examples?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Crashes

Crashes are caused by paths of execution that anr't closed or terminated. For instance a certain sequence of inputs causes something "unintended and unexpected" to happen. In terms of machine codes the execution runs over the end of a block and into an adjacent unrelated block, at this point anything can happen. Another example is the bad pointer problem. "Spaghetti code" makes this type of error more likely. Type checking seem to reduce the probability of this but as far as I know doesn't preclude it. Some people will put in GOTO error codes at points where execution should never reach. I wonder if there is any formal way to check for this? It seems to me that there could be.

Soundness

Type checking seem to reduce the probability of this [crashes] but as far as I know doesn't preclude it.

Of course it does! That's the whole point of soundness proofs for type systems. Granted, current mainstream languages rarely have a formally defined type system, let alone one that's sound. But the typed languages more popular in this forum, e.g. ML and Haskell, have (modulo some explicit unsafe operators intentionally put into some of them).

Regarding gotos: of course, they don't exist in any decent high-level language. :-) But nevertheless they can be typed in a sound manner. Have a look at typed assembly language, for example.

Hard errors often manifest easily detectable symptoms

Human developers, particularly experienced ones, tend to write very stereotypical code. They also write code that is either correct or very close to correct, differing from correct code in only one or two particulars. Given this, static error detection in human-written code is vastly easier than the halting problem, for an enormous set of possible errors. Just code up detectors for a bunch of common bug patterns and your bug yield can be very high. For instance, you can't write an analysis program which will find all infinite loops, but for many languages you can write analysis programs to find loops where the exit condition is vacuous, unreachable, or missing and it turns out that that's most of the infinite loops that programmers actually write. Do a bit of research into the sorts of bugs developers actually make, and there are tons of easily detectable bug patterns you can statically analyze.

For non-terse, well-behaved, statically-typed language, many static error analyses are so easy that it doesn't surprise me at all that a type checker could be coaxed into doing some of them.

metal

Wow

Thanks for the pointer on this. I hadn't read these papers before, but seem to have been living them for the last few years. I love their idea of statistically inferring intended program invariants and then driving static analyses off of those inferences. Expect to see a commercial implementation of that for Java by year-end.

It was spun into a

It was spun into a commercial product (though for C/C++ I think (?)). But yes, it was impressive. One thing though is that I think the technique would be much less useful in languages LtU people usually prefer. But this is a completely (educated) guess; it would be interesting to see how it works out.

Coverity

Coverity is the company your thinking of. Their website (which seems to be having some trouble right now) mentions analysis for C,C++, and Java. At the last linuxworld, their CTO mentioned in passing an employee creating a stack usage analysis for php.

The interesting thing about coverity vs e.g. Polyspace, is the decision to emphasize scale over soundness/completeness. Coverity can work with huge codebases (e.g. the Linux kernel or Mozilla) and catch many errors, but does not guarantee the no runtime errors occur.

I recall seeing a similar

I recall seeing a similar report from, IIRC, Andrew Koenig. The trick is that an infinite loop, or bottom, can be given an arbitrary type. Consider the following example:

fun loop () = loop ()

That function is inferred to have the type

val loop : unit -> 'a

The 'a as the return type (meaning: the result of the function can be given an arbitrary type) shows that the function never returns normally.

I write SML code daily and my experience is that when I get the code to type check it works like I thought it should work.

I'll Try to Say This Politely

Mike Smith: I wonder then whether this is a one-off, or whether people who use ML and similar languages frequently find that the type checker points them towards bugs that even a dynamic-typing-language programmer would regard as genuine - - that is to say, errors, like infinite loops, that are not on the face of it just type violations?

The misapprehension that I frequently find on the dynamic typing side is the belief that static typing can't treat things beyond just ensuring that an integer is used as an integer, a float as a float, etc. The phrase "just type violations" is a huge red flag indicating that the writer's understanding of type systems is in need of enhancement.

vesa_karvonen's experience mirrors mine: O'Caml gives me an order of magnitude more of that "if it compiles, it works as it should" experience than, say, my Java day-job does (!), and my Java day-job gives me an order of magnitude more of it than, say, my C++ day-jobs and "recreational" (in the sense that masochism can be said to be "recreational") programming did.

Granting all that does the

Granting all that does the O'Caml type system actually do anything to detect crash type errors, or is this due to enforcing a certain level of rigor. Is the problem solved or is it less likely?

Never Say Never

Hank Thediek: Granting all that does the O'Caml type system actually do anything to detect crash type errors, or is this due to enforcing a certain level of rigor. Is the problem solved or is it less likely?

I'm very reluctant to say it's impossible to crash the O'Caml runtime. I can honestly say, however, that I never have. On the other hand, I haven't yet done anything with truly enormous datasets in O'Caml, so I likely haven't bumped into some of the runtime limits that are being discussed on the mailing list. Apparently some of the standard library routines aren't tail-recursive, for example, and so using them on large constructs runs the risk of stack overflow, and there are some challenges for the runtime in catching stack overflow correctly on some popular platforms. Most folks in the O'Caml community know that these functions aren't tail-recursive, and so tend to use the ones from ExtLib preferentially, but that of course begs the question as to why you should have to know to do this.

With all that said, let's be clear that that reliability's not a function of being statically typed, but of being strongly typed: I've never crashed DrScheme, SBCL, Python, or Ruby, either.

OCaml serialization is not type safe

input_value can produce a value which crashes the runtime.

The analogous problem doesn't exist in dynamically typed languages.

Sure It Does

QrczakL input_value can produce a value which crashes the runtime.

The analogous problem doesn't exist in dynamically typed languages.

Sure it does, if you don't restrict yourself to only reading values that were serialized out from the language in question, and that's still assuming that those serialized values are tagged (which, in dynamic language, of course will be the case). So all you're really saying is that since you have tags at runtime, you can use them at runtime to mark stuff that isn't even in the runtime. That's a truism. And of course O'Caml marshaling isn't type-safe; marshaling/unmarshaling and using the Foreign-Function Interface are the well-known "interaction points with the outside world" that require some kind of external support (for example, Saffire for checking use of the FFI and HashCaml for type-safe marshaling).

The fact that both the popular dynamically-typed languages and pretty much the entirety of the statically-typed languages other than C/C++ are memory-safe is indeed one of the several pleasant things that they have in common. What I appreciate about a good inferred statically-typed language is that it's dramatically safer even than that, where "safer" includes definitions like "behaves unexpectedly to the user less." Less than what? That's where the question lies, isn't it?

Same problem in dynamic languages

The analogous problem doesn't exist in dynamically typed languages.

Sorry, but that's pretty naive. Dynamic "typing" mainly helps with external consistency of read values, i.e. checks how they are used. If the language and its serialisation mechanism are rich enough - in particular, if values can include code - then you also will have to perform all kinds of non-trivial checks to verify internal consistency of such values, in order to ensure integrity of the runtime.

Almost no existing language implementation does that. For instance, it is no problem to feed Oz a malformed pickle that crashes the VM.

White flag

My understanding of type systems is in need of enhancement? That's why I'm here!:)

LOL

Mike Smith: My understanding of type systems is in need of enhancement? That's why I'm here!:)

Yeah, me too, but let me suggest that you hie yourself (if you haven't already) to TAPL and work through the exercises. I'm doing it, albeit in my Copious Free Time™ so it's taking longer than I'd like. But the only more eye-opening experience I've had is reading CTM. :-)

Soundness

Not being a computer science person this may be a bogus question but I am really curious. Suppose I have a "sound semantics" and I write a C interpreter for that language; is the result sound?

Epigram

I was just thinking about termination guarentees in the type system, and how far this should go.

Epigram distinguishes between structural and general recursion, using different operators for each. This means you to retain decidable type checking for dependent types: by only allowing functions that are guaranteed to terminate in type-forming expressions.

Other people have suggested non-termination should be put into monads along with other side effects. I'm not sure how far this should go - making termination a part of the function signature would probably damage modularity quite a lot.

If you prove it...

[in reply to Hank Thediek's above post, sorry for not following up properly]

If you prove your implementation correct then yes. ;-)

Of course, you can ask the same question one level deeper: is the C compiler correct? Is your CPU correct? This is related to the problem of trusted base in security.

But the point is, when a program of yours crashes, but the language you wrote it in is known to be safe and sound, then you know it was not your fault, but that the implementation is buggy (or something at an even lower level). A sound language hence represents an important, impenetrable abstraction barrier, which has all kinds of obvious and less obvious advantages.

Now that is exciting! Thanks.

Now that is exciting! Thanks.

Neat example

The type of sort is Ord a => [b] -> [a] with no annotations, which proves that sort cannot possibly do anything meaningful. I wonder if this is a case for having compilers to compute principal types without paying attention to annotations first, since an annotation completely hides this defect.

I don't understand

How Ord a => [b] -> [a] is the type of sort ...
I've never seen behaviour like that from GHC. Can you clarify?

It refers to this.

It refers to this.

Thanks!

It is a neat example. And I now understand Koray's point about annotations possibly masking such defects.

hehe

None of the examples I've seen so far preclude implementation in a dynamic or psuedo-dynamic language. The problem isn't necessarily detection, as demonstrated by the use of the phrase "enough information for it to be apparent to anyone paying attention that something was wrong." It would be ridiculous to, say, expect the programmer to declare any loops that aren't purposefully infinite as such. This obviously points to the use of warnings, which suggests it isn't so much a static vs. dynamic thing.

Stipulation

Let's stipulate that you can check anything that's computable at all at runtime. So that's not a very interesting observation. What's interesting is how far you can go in the other direction: how much you can guarantee before runtime. It turns out that even in today's functional languages, e.g. O'Caml and Haskell, the answer is "a lot farther than you think unless you've been doing it for years." When you consider how much farther still you can go with either recent additions (e.g. GADTs) or still-not-common ones (dependent types), it becomes very exciting indeed.

Also, don't forget, there's a class of problems (concurrency, distribution, proof of adherence to a security policy) that it's simply incoherent to talk about "solving dynamically," as there's no meaningful sense in which "handling at runtime" can be made equivalent to "I've proven that this problem doesn't exist before I've run the code."

I wasn't speaking about

I wasn't speaking about checking at run time, merely that compile time warnings are applicable to both static and dynamic languages.

Soft Typing?

But, if you start checking types at compile type you essentially have static typing, right? Well, either that or soft typing. Personally I think that a soft type checker, done right, would be the way to go if programmers didn't start ignoring the warnings of the type system. Unfortunately it seems to be human nature to ignore warnings until its too late.

Sort of. Warning at compile

Sort of. Warning at compile time is most certainly in the relm of dynamic typing, but actual errors would be possible in a hybrid approach.

Phantom types, parametric polymorphism

Many uses of phantom types (unused type variables, not first class phantom types/GADTs) and parametric polymorphism establish invariants that would be very difficult perhaps even impossible to deal with in a dynamically typed language. One example of the latter, is the ST monad of Haskell whose extra type variable and rank-2 runST function establishes an invariant that would be awkward at best to recreate in a dynamically typed language.

I consider phantom types a strong example of a benefit of static typing over dynamic typing. Phantom types involve adding completely "unnecessary" restrictions to functions in order to maintain a "semantic" invariant. There is no correspondence with dynamic typing as the information included has no connection to the data representation. Further, this is not just some theoretical construct, but is used out there in the wild.

Dependent types and related schemes definitely lead to the catching of "interesting" errors.

Many uses of phantom types

Many uses of phantom types (unused type variables, not first class phantom types/GADTs) and parametric polymorphism establish invariants that would be very difficult perhaps even impossible to deal with in a dynamically typed language. One example of the latter, is the ST monad of Haskell whose extra type variable and rank-2 runST function establishes an invariant that would be awkward at best to recreate in a dynamically typed language.

Could you give an example of this? I suspect this isn't the case, but I'm not fully understanding what you mean.

You can't return an STRef

You can't return an STRef from the invocation of runST it was created in. The code won't compile.

ST

More interesting, you can't use a reference inside one invocation of ST which was created by any other, even a nested one. This means that running the same computation several times will give the same answer, executing an ST computation is a pure operation, even if the computation uses variables internally.

ST would be much less interesting in an impure language. With respect to referential transparency, ST is like drilling a hole in a pot and then patching it - if you start with a sieve instead, why bother with the patch?

Phantom Types Example

A simple example of phantom types: ensuring that reads and writes are threaded over uses of a "slot" type. Here, 'a is the (real) type of the "slotted" values, and 'b is the phantom type:

module type SLOT = sig
   type ('a, 'b) t
   val from : 'a -> ('a, [`Full]) t
   val get : ('a, [`Full]) t -> 'a * ('a, [`Empty]) t
   val put : 'a -> ('a, [`Empty]) t -> ('a, [`Full]) t
end

For reasons that will be clear below, you cannot create an empty slot, only a slot filled from an initial value (this is a limitation of the implementation I chose, not of the technique). Also, for phantom types to work, the phantom type parameter must remain unbound in the interface, by keeping the type abstract. The implementation is like this:

module Slot : SLOT = struct
   type ('a, 'b) t = 'a
   let from x = x
   let get x = x, x
   let put x _ = x
end

Really, that's it! The type checker ensures that all gets and puts are interleaved:

# let x = Slot.from 1 ;;
val x : (int, [ `Full ]) Slot.t = <abstr>
# Slot.get x ;;
- : int * (int, [ `Empty ]) Slot.t = (1, <abstr>)
# Slot.put 2 x ;;
This expression has type (int, [ `Full ]) Slot.t but is here used with type
  (int, [ `Empty ]) Slot.t
These two variant types have no intersection

In contrast, a traditional, mutable, object-oriented, explicitly checked implementation would be something like this:

class ['a] slot = object
   val mutable slot : 'a option = None
   method empty = slot == None
   method put x = match slot with
   | None   -> slot <- Some x
   | Some _ -> failwith "put"
   method get = match slot with
   | Some x -> slot <- None; x
   | None   -> raise Not_found
end

Clearly, you have to do all the grunt work for yourself, in particular, you have to test explicitly in the client code whether the slot is empty or not.

Ah, I see. Clever. What

Ah, I see. Clever. What about:

data Slot:
    val

type Full = Slot(x => x.val!=None)
type Empty = Slot(x => x.val==None)

def put(x:Empty, y):
    x.val = y

def get(x:Full):
    val = x.val
    x.val = None
    return val

def create_slot(x):
    return Slot(x)::[put=put, get=get]

Dependant types to the rescue ;-)

What about something

What about something like:

data VariableStore:
    ...

def new_var(variable_store, name, value): ...
def get_var(variable_store, name): ...
def set_var(variable_store, name, value): ...
def commit(variable_store): ...
def retry(variable_store): ...

# Excuse the weird syntax; this creates a "class" of sorts
def create_variable_store(...):
    return VariableStore(...)::[
        new_var = new_var,
        get_var = get_var,
        set_var = set_var,
        commit = commit,
        retry = retry
    ]

# This basically strips off any previous member functions and 
# only puts the 'get_var' function back.
def read_only(variable_store):
    return class_data(variable_store)::[
        get_var = get_var
    ]

which would be used like:

vars = create_variable_store(...)
use_variable_store(read_only(vars))

Which can ensure that

Which can ensure that use_variable_store isn't able to write, but can't check that it won't try to write. If I write the above code, and pass a read_only(vars) to a function, I have no reason to believe that it won't try to call new_var, set_var, etc., and raise an error at runtime.

I know, that was the intent. Of course, I'm assuming there are ways to forcedly cast the value to a different type when using phantom types so I decided to mimmick this behavior. If you'd like, you could use a proxy object to completely forbid access to the raw data structure and therefore modification:

def read_only(variable_store):
    return None::[
        read_var = (self, *args => variable_store.read_var(*args))
    ]

Or, if you don't care about

Or, if you don't care about keeping the same interface in case the function might eventually need write priveleges, you could just pass the read_var function around:


vars = create_variable_store(...)
use_variable_store(vars.read_var)

While there's no value of

While there's no value of phantom type, there's a value whose type is parameterised on one - I guess a term for that might be useful if there isn't one already?

There is no value of a

There is no value of a phantom type (hence the 'phantom'). If can have an action of type Int -> STM Full (), the Full part is nothing more than a type -- there is no value of type Full.

I know. What I'm getting at is that it could be possible to, say, create a function that converts ReadOnly phantom types to Full, circumventing the type checks.

Using a proxy object still won't give us any reason to believe the code correct without showing that the proxy is used appropriately, i.e., that the function has an appropriately restricted type.

How is using phantom types different from this?

It's not possible to

It's not possible to circumvent the phantom typing if you hide the relevant constructors.

Such a function is trivial

Such a function is trivial and only circumvents the type checks in that it makes explicit the fact that Full subsumes ReadOnly:

It also has the effect of making it impossible for the compiler to completely guarantee. I just reread your previous assertion in which you state it would only be checkable at run time. This is false; my approach allows for a compiler to check it at compile time, exactly like your example.

Guarantee what? We WANT to

Guarantee what? We WANT to be able to do such things. If we cannot do such things, then an action that we label Full cannot read or we require some mechanism to ascribe multiple types to the read-only actions, or we need to start building type-level collections. As it is, we just use the type system, which is quite capable of proving this for us; without the use of unnecessary functions performing the 'cast' for us.

Sorry, I'm not particularly good at articulating my thoughts :) What I'm talking about is using a function to cast a ReadOnly STM into a Full STM, thereby circumventing the checks put in place by the type system. If that's not possible, then disregard the first solution as equivalent, as you'd need a proxy object to emulate that sort of behavior.

You Do Need the Proxy...

...because abstract types in ML are inviolate (well, OK—O'Caml has Obj.magic, but anyone who uses it without submitting to the requisite 40 lashes from Xavier Leroy is a fool).

In any case, however, the point of phantom types, again, is that wrong code won't even compile, and you'll get a pretty clear error message indicating why it didn't compile. This is why, generally speaking, I prefer a good inferred statically-typed language these days.

In any case, however, the

In any case, however, the point of phantom types, again, is that wrong code won't even compile, and you'll get a pretty clear error message indicating why it didn't compile. This is why, generally speaking, I prefer a good inferred statically-typed language these days.

My approaches allow for compile time checking exactly as you described, as well :-)

Well...

The punchline of Dominus' story is that the inferred type of his sort function was not what he was expecting: 'a list -> int list rather than int list -> int list. The inferred type is a static property; the whole point is that Dominus did not have to run the sort to determine it had a bug and would never terminate. In my experience, it is reasonably common for buggy code to show up as unexpected typings in OCaml.

Now, a soft typing system could give exactly that same message as a warning, but Dominus goes on to give a short list of some Perl constructs that would fail to typecheck under an ML-ish type system:


while (<>) {
$total += $_;
}
...
@a = localtime();
$s = localtime();
...

These are pretty common Perl idioms, and would definitely qualify for a "stupid type checker whining about perfectly valid code" complaint. A type checker that would work for Perl clearly would have to be very different from that of ML, and has some open questions: Would it give excessive numbers of false warnings? (Ultimately, any might well be too many.) Would it give the same level of "after I fixed the typing errors, the program just works"? (Perl's base types are probably something on the order of scalars, arrays, and hashes rather than ints and floats; and silent conversions are very common.)

The key difference I see is that the semantics of ML and its type system are closely linked---for example, adding a string to a number is completely alien to ML---something that is forbidden to soft type checkers.

Thanks!

That's a great explanation as to why I'm extremely skeptical of "soft" or "pluggable" type systems for languages that weren't designed with static typing in mind: it just seems mind-bogglingly unlikely that you'll get the benefit from them that you do from an inferred statically-typed language in which the semantics and type system are closely interrelated.

Missing an important point

Curtis seems to be following a common line in the static vs. dynamic debate, saying that you could catch errors in dynamically code at "compile" time if you wrote some kind of static analysis. Of course!

From this point of view, the interesting thing about (good) type systems is that somebody has already written the analysis, and that you can leverage that one analysis to prove properties you care about, with just a little work.

But, once start talking in these terms, then you are no longer talking about the what errors can be caught by a static type system (the original question), but debating the merits of catching those errors statically by using the typed system of a designed-to-be-typed language, or by writing an analysis tool for a dyanmically typed language. Maybe somebody should make a topic for that discussion?

What? I'm confused--I'm

What? I'm confused--I'm talking about type checking in the compiler, not in an external tool. Anything that can't be proven false at compile time by the compiler would be accepted, at which point the editor can step in and do warnings if it so wishes, but I'm not sure if that's what you're talking about.

Parametricity and Fusion

In type systems with parametric polymorphism, you can prove some term has a type forall a.T(a) if you can show that the term has type T(A) where A is some freshly invented type constant. Without things like unsafe casts or unrestricted reflection, you can't do anything with values of types like that except pass them around.

This means that just knowing the type of a polymorphic function gives you some information about what it does. For example, the identity function has type forall a . a -> a. The function is only given a single a, so all it can do is return that value, or fail to terminate.

An interesing application is the foldr/build rule for list fusion. These are functions,

foldr :: forall a b . (a -> b -> b) -> b -> [a] -> b
foldr f x [] = x
foldr f x (a:as) = f a (foldr f x as)
build :: forall a . (forall l . (a -> l -> l) -> l -> l) -> [a]
build k = k (:) []

foldr reduces a list to a single value, by replacing conses with its first argument, and nil with the second. build takes a "list"-building function parametric in the "cons" and "nil" is uses, and uses it to build an ordinary list. For example,

foldr (+) 0 [1,2,3] = 1 + (2 + (3 + 0)) = 6


and if we define

k cons nil = cons 1 (cons 2 (cons 3 nil))


then

build k = k  (:) [] =  (:) 1 ((:) 2 ((:) 3 [])) = [1,2,3]

The fusion rule says that when foldr is used to consume a list constructed by build we can skip making a list: foldr f x (build k) = k f x. Parametricity tells us that because k is polymorphic in l, the only way it can be producing its result is with the provided functions. Following the example above, we have

k (+) 0 = 1 + (2 + (3 + 0)) = 6 = foldr (+) 0 (build k)

In the presence of casting (say, a dynamic langauge or a soft typing system) this all breaks down - we could instead have defined

k cons nil = cons 1 (2 : (cons 3 nil))


which works fine as an argument to build, but gives nonsense instead of fusion:

k (+) 0 = 1 + (2 : (3 + 0))

Correct Type-checker in Epigram

One of the examples in the Epigram paper The view from the left is an AST and type checker for a simply typed lambda calculus, where the typed AST can only represent well-typed programs, and the type of the checker is a proof that it is actually checking the term provided, and if it says it passed then you really did get a typing for your program.

They don't mention any particular bugs this caught, but any time you see "this type guarantees the correcteness of that program" you could read "this type catches any bug that would possibly be made in writing that program". :)