Tail of Nil and Its Type

Epigram: practical programming with dependent types

Find the type error in the following Haskell expression:


if null xs then tail xs else xs


Found it? Of course you haven't. But this program is obviously nonsense! Unless you're a typechecker, that is. The trouble is that only certain computations make sense if the null xs test is True, whilst others make sense if it is False. However, as far as the type system is concerned, the type of the then branch is the type of the else branch is the type of the entire if . . . then . . . else . . ..

Statically, the test is irrelevant. Which is odd, because if the test really were irrelevant, we wouldn't bother doing it.

We mentioned this issue in discussions. LtU recently featured Epigram with emphasis on its interactive programming abilities, but I would like to add that Conor McBride's papers are mostly about practical use of dependent types for programming. They are also fun to read, though I can easily imagine how his experiments with presentation, like this one, can annoy some people.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

NullPointerException

I see the empty list as being like null, which Nice has static checking for.

Maybe

One can use Maybe to express null-like semantics in Haskell1, but that will only postpone the problem.
I mean, even if we have a type of NonEmptyList and PossiblyEmptyList, we only get guarantee that a tail of NonEmptyList exists. The tail must be of type PossiblyEmptyList, though, so we returned to the original problem of hesitating to take the tail of a list for the fear of, eh, an error that is not a type error.


1Of course, one can say that it's null that is trying to express Maybe-like semantics :-)

Total functions

IMO it's a problem of not using a total function to define tail. If we did so, this would be easily solved:

tail :: [a] -> Maybe [a]
tail [] = Nothing
tail (_:xs) = Just xs

tailOfTailOfTail :: [a] -> Maybe [a]
tailOfTail xs = tail xs >>= tail >>= tail

This also works for NonEmptyList and PossiblyEmptyList types, we just need to define an overloaded tail for each one.
Also we can always use the type system to encode the size of the list, but without true dependent types this task is much harder (e.g. the error messages are truly adorable).
IME it's a problem of programmers having expectations that exceed those provided by the type system (and language designers introducing hacks like error which are nice in development but awful during testing and maintenance).

Thanks, but...

I actually had monads in mind, but did not feel my Haskell is good enough to demonstrate this implementation.

But as you provided it for free...
The problem with this solution (for me) is it just lifts error to Nothing, requiring the programmer to decide what to do at runtime when this really happened. There is no way for the programmer to express his unwillingness to even think about such atrocities as beheading empty lists. Instead of compiler rejecting the programs that could possibly go wrong, the programmer has to provide a fire escape for programs to go when they feel like going wrong.

Goal directed programming

There is an advantage though. Like the example shows, Nothing doesn't have to be handled. Any list that isn't big enough is *automatically* ignored. So the programmer may not have to do anything, if lists that are too small aren't an error, but just a case that doesn't apply to the rest of the computation. This is goal directed programming.

It may return Nothing :-(

Any list that isn't big enough is *automatically* ignored.

"Ignored" means the whole computation delivers Nothing.
Using Maybe for this is like saying - let's turn this partial function into a total one by lifting its range to include Nothing. It became total by obtaining permission to return something I have no use of. I do not say monads are not useful, or Maybe is not useful.
It just does not solve my problem. As a user of tail function, I want it to return List. Always. I recognize the right of the author of tail function to expect something from me in exchange. I am willing to satisfy his expectations, and I want the compiler to verify our mutual respect. The problem is, Haskell does not give us the means for that (or I am not aware of them).

That's why >>= is used

Instead of f (tail xs) you write tail xs >>= f. (I think, I don't use Haskell.)
So f expects a List, and it always gets a list, because when tail returns Nothing, f is ignored.

In your words f is the user of tail, and it doesn't have to know about Maybe.

Yes, but...

Instead of f (tail xs) you write tail xs >>= f. (I think, I don't use Haskell.)
So f expects a List, and it always gets a list, because when tail returns Nothing, f is ignored.

Yes, but this is just poor man's exception handling, and it has precisely the same problems as "error". In fact, it's precisely the same thing, without the syntactic help.


The point is that sometimes when I write "f (tail xs)", I don't want to think about the case where xs is empty, because I don't believe it can happen. Either you say, "Obviously it CAN happen and you need to do something reasonable" (handle the exception somewhere in your code, or handle the Nothing somewhere in your code: they're equivalent), or you say, "Let's prove statically that it really can't happen, and we can all go home happy."


I'd much rather do it the static way if I can. No matter how I wrap up the unexpected case at runtime, it's still there, and it's still unexpected...

Preach to the compiler

I don't want to think about the case where xs is empty, because I don't believe it can happen.

Well you may not believe but the type system does (or is lied to in the use of error) because it can only see the static behaviour of the system (i.e. what's encoded in the types). So you need to make the compiler a believer and encode this property in the type (as Oleg does in his Haskell articles).

Excuse me for not being clear enough

So you need to make the compiler a believer and encode this property in the type (as Oleg does in his Haskell articles).

Of course, and I did not object to that.
It just the particular language between me and compiler that I am not happy with. I know that Haskell implementations are looking for expressiveness, adding features like multi-parameter type classes, implicit parameters, existentially quantified data constructors, and much more. I may be not right, but from aside this looks like "agglutination of features". Dependent types claim to solve most of these problems in a unified way. I do not wish to denigrate Haskell, its community, or researchers. I do not think I would succeed if I wanted. They all did a great job, and are still doing it. If I never saw Haskell I would not even try to read dependent type literature. Before learning FP I was quite happy with Java, for that matter. The more one gets the more one wants.

I started this thread mostly because there was not much dependent type systems discussion on LtU. People tend to dismiss them as undecidable, or unwieldy, or too academic. I think we need to change that.

I really hope there are no feelings hurt.

"it's still there, and it's still unexpected"

I'd much rather do it the static way if I can. No matter how I wrap up the unexpected case at runtime, it's still there, and it's still unexpected...

Exactly what I wanted to say, but obviously failed.

It's worse than that

Yes, but this is just poor man's exception handling, and it has precisely the same problems as "error". In fact, it's precisely the same thing, without the syntactic help.

Even worse, Maybe is promoted to the place where it has almost zero semantic content. In Haskell, all types (well, of kind *, if you're using GHC with unboxed types) are lifted already. We don't need to do it manually here.

The other problem I have with Maybe is the problem that I have with Bool. Most instances of Bool should really be a custom two-valued enumeration. So instead of this:

data Foo
  = Foo {
        data :: FooData,
        hasPropertyX :: Bool
    }

we should really write:

data PropertyX = IsX | IsNotX

data Foo
  = Foo {
        data :: FooData,
        propertyX :: PropertyX
    }

I can't tell you how much frustration I've had and how many dollars I've cost employers trying to remember precisely what a flag means. Doing this in the first place would have save a lot of hassle.

Similarly, with Maybe, you fast end up in the situation where you have some Maybes being meaningful and others being manually lifted types, and programmers unable to tell which is which.

It should be some form of map

Otherwise f should return some monadic value (i.e. (>>=) :: Monad m => m a -> (a -> m b) -> m b).

map :: Monad m => (a -> b) -> m a -> m b
map f m = m >>= (return . f)

In Haskell fmap from the Functor typeclass has this definition.

I'm not sure I understand what you're saying

There is no way for the programmer to express his unwillingness to even think about such atrocities as beheading empty lists.

Well as the list type has two constructors, when a expression uses a list value it must be able to deal with both cases. Otherwise you need other kind of datatype. IMO head/tail as partial functions are inherently wrong.

Instead of compiler rejecting the programs that could possibly go wrong, the programmer has to provide a fire escape for programs to go when they feel like going wrong.

IMO the compiler does reject programs that could possibly go wrong, but as the program accept "erroneous" values the compiler requires the programmer to decide what he would like to do in the other situations. The compiler can't reject what he think is correct and according to the given types the program is correct. If you need static guarantees you need to tell them to the compiler (i.e. make them part of the type) and if you don't have a dependent type system you can't just propagate boolean constraints.

Such tasks are similar to parsers: if you need to parse digits from a string you must handle the case where the string has invalid characters.

I say - Haskell is not "static" enough

My point is that Hindley-Milner1 type system is not expressive enough to allow two (or more) consent programmers to state their mutual expectations.
As an author of tail function I can't promise to always return list given non-empty list and expect the compiler to reject all programs that try to call tail on empty list.

I suspect it's possible to stretch Haskell a bit to encode this particular constraint, but as you said, dependent types are way to go for those who seek more static expressiveness.

if you need to parse digits from a string you must handle the case where the string has invalid characters

Parsers are living on a steep border between lands of different structurization degree. Let them suffer. There are a lot of problems located on the plains (or plateaus) - my care is about them in this thread.


1I greatly respect them both.

My point is that Hindley-Miln

My point is that Hindley-Milner1 type system is not expressive enough to allow two (or more) consent programmers to state their mutual expectations.

I refute this, as Oleg showed in another post. Saying that I agree that the solution is more verbose than we expect. If the compiler infered constraints and propagate them, it would be easier for the programmer to state his expectations.

Dependent types can see where simple types can only look :-)

Parsers are living on a steep border between lands of different structurization degree. Let them suffer.

It does not mean dependent type make this suffering worse than it was before them. At least one gets explicit and does not lie to his compiler :-) Sections starting from 17 deal with problems like this (for example, obtaining bounded natural from unbounded one, or typed terms from untyped ones).

After taking this pain on the interface with less typed systems, you can enjoy nice dependent types throughout your code. It really feels the same as integrating ST and DT programs.

"Well-typed programs don't go wrong"

Of course, head and tail don't go wrong when you apply them to empty lists, because well-typed programs don't go wrong, so we'd better pick a different word to describe the way they do go.

Any suggestions, which way they do go?

Nowhere

They go nowhere.

Or do they?

Matthias Felleisen gives a good overview of the meaning of "well typed programs don't go wrong" in
this c.l.scheme post.

Dependently Typed > (Independently?) Typed

...or "All languages are typed but some are more typed than others".

Following the trend, I would like to present another theorem imposing some order on PLs.

Dependently Typed PL > (Independently?) Typed PL

As S → T is just a special case of ∀x : S ⇒ Tx, we can trivially obtain an inclusion of all, eh, independently typed programs into dependently typed ones. QED

Good point!

I like it, Andris, partially because it's clever and partially because it's true. In fact, this is also a proven result, and it comes with a rather serious border being crossed: in dependently-typed languages, type checking/inference is undecidable. It's possible to write code that will not only go into an infinite loop at runtime, but at compile time.

When I first learned of this in the context of Cayenne, I was horrified, and in fact never even downloaded Cayenne to take a look at it. What a buffoon! Only much later did I learn that template instantiation in C++ is also undecidable, and all real-world compilers impose a maximum template instantiation depth beyond which they simply refuse to go. QED.

In any case, it's quite true that dependently-typed languages have strictly greater expressive power than independently-typed ones, which have strictly greater expressive power than untyped ones.

Not necessarily undecidable

in dependently-typed languages, type checking/inference is undecidable

Not necessarily. In Epigram type checking is decidable, because all functions are total. Conor McBride plans to add general recursion, but its use would be tracked by the system so that types can only depend on terms from the total fragment.

General recursion in Epigram

Conor McBride plans to add general recursion, but its use would be tracked by the system so that types can only depend on terms from the total fragment.

Can you disclose your sources? I was under impression that uses of general recursion will be marked as suspicious (equivalent of unsafeIO for hardcore hackers?).

Same source

Your impression doesn't contradict my statement: “marked as suspicious” is what I meant by “tracked”.

Anyhow, I found the reference and it's the same as yours:
Epigram: practical programming with dependent types, § 7:

When I get around to it, Epigram will support definition by general recursion, but such programs will be explicitly marked as suspicious and hence never executed during elaboration.

Anyone interested in some lectures?

I do not see clearly what that would mean :-(

This gives me an idea, why not invite Conor McBride to LtU?

Feel free to invite him

If anyone is in touch with him and wants to invite him, feel free to do so. I am also open to suggestions as regards guest bloggers (hint, hint).

Promoted

I promoted this thread to the home page.

Exception-free programming in Haskell98

I'd like to point out that it is possible in Haskell98 to write non-trivial list-processing programs that are statically assured of never throwing a `null list' exception. That is, tail and head are guaranteed by the type system to be applied only to non-empty lists. Again, the guarantee is static, and it is available in Haskell98. Because of that guarantee, one may use implementations of head and tail that don't do any checks. Therefore, it is possible to achieve both safety and efficiency. Please see the second half of the following message:

http://www.haskell.org/pipermail/haskell/2004-June/014271.html

The only change we have to make is in our point of view.

Is it Haskell98?

Again, the guarantee is static, and it is available in Haskell98. [...] Please see the second half of the following message [...]

Do you refer to the code that uses implicit parameters?
Or do you mean the third third part of that message? The one with "We cannot avoid run-time checks" in it?

Presentation Style - It's just plain Brilliant!

They are also fun to read, though I can easily imagine how his experiments with presentation, like this one, can annoy some people.

Personally I think that is just plain brilliant.

Far far to much time is wasted on slick power point presentations, when a quick whiteboard sketch is way better.

My Rule of Thumb for conferences is the content value of a talk is inversely proportional slickness of the presentation.

With densely written handwritten overhead slides that expand your mind at one end of the scale, and slick vendor supplied vaporware at the other end.

This presentation style has all it should. Simplicity of write and rub overhead slides. The ability to add cartoon like pointers and sketches. Verbose explanation on the RHS. Ability to create bring in points one by one.

This is just plain really Good. I love it!

I love it, too

It's just a non-conventional method, and as such can meet more criticism than mediocre PPT. Hope I am wrong.

Using subtyping

I agree with Daniel Yokomizo that the problem in this code is that tail is a partial function. His solution, in Haskell, is to make it total by extending the codomain, that is return an option type.

If you have atomic subtyping, there is an alternate route, which is to define tail on non-empty lists only (that is, reduce the domain of tail). In Nice:

abstract class List<T> {}
class Nil extends List<T> {}
class Cons extends List<T> { T head; List<T> tail; }

Here tail is a field, which when seen as an accessor function has type <T> Cons<T> -> List<T>, and it's total.

Now, the trick is, how do you get the guarantee that you can call tail? Simple, by testing if the list is a Cons or not. Since Cons is now a type, the compiler tracks that test, and therefore can ensure that the call is valid:

List<String> l = ...;
if (l instanceof Cons)
  l = l.tail; // valid, here l has type Cons<String>

Often, you would actually do the test using (multi-)method dispatch. For instance, computing the length of a list:

<T> int length(List<T>);
length(Nil l) = 0;
length(Cons l) = 1 + length(l.tail); // l has type Cons<T>

Personally, I find it more natural to state that tail only operates on non-empty lists than to allow it to return Nothing.

Tail of tail of...

[Nice code]1

Ok, this enables me to write method of type, say

<T> Cookie<T> magicCookie(Cons<T>)

which is guaranteed by compiler not to receive empty list.


How do I declare a method that wants lists not shorter than 3? Two lists of the same length? N lists of length N each?


One can probably solve some of these problems in non-dependent type system, and some of these solutions are even beautiful. However, they seem to be ad hoc, and require an ingenious mind.


Existence of hackers who could program any computation in machine code didn't prevent people from inventing Fortran ;)


I would be interested to hear, whether people do not consider it important to express such properties statically.


Or are these properties easy to express without dependent types?


Or are they hard to express with them? :)

1One can, of course, provide similar solution in Haskell.

Static properties

I would be interested to hear, whether people do not consider it important to express such properties statically.

I consider it very important, since I learned about Eiffel years ago.

Or are they hard to express with them? :)

I'm not an expert but some of the examples in the papers can get pretty convoluted. Sometimes we just want to write a predicate that should be true. I wonder if instead of using dependent types it would be simpler if we added a (possibly undecidable) constraint layer to terms, so each term would have a predicate that should be true at run-time and the compiler could infer necessary constraints.

One can, of course, provide similar solution in Haskell.

I don't think so, as the solution in Nice uses subtyping which isn't available in Haskell (unless we use type-class tricks).

Is this fathful enough?

I don't think so, as the solution in Nice uses subtyping which isn't available in Haskell (unless we use type-class tricks).

Tricks? Aren't classes the official way to do subtyping polymorphism?

Anyway, I think open nature of Nice subtyping is not essential here, so we can pretend that nobody will extend (Nice) class List except Cons and Nil.
When subtyping is closed, dynamic dispatch can be seen as mostly syntactic sugar for case analysis on all subtypes.
Ah, and we do need to give Cons a higher status - we cannot declare we want to accept constructors in our function.

That amounts to something like:

data List a
  = Nil (Nil a) | Cons (Cons a)

data Nil a = NilC
data Cons a = ConsC a (List a)

tail :: Cons a -> List a
tail (ConsC _ list) = list

Probably the longest Haskell code I've written :)
It's a pity Haskell insists on unique constructor names (for type inference, no doubts), otherwise I could obfuscate this even more :)

Tricks

Your code is indeed a transcription of the idea in Haskell, once you suppose a closed world (fair enough in this case). However, you'll have to agree this is not very convenient to use: 'Cons a' is not a list, it can be wrapped manually into a list using the Cons constructor.

That said, I agree with the original post that this does not carry to arbitrary dependent types. I was merely pointing out that the motivating example was not hard enough to warrant them.