Ruling out nonsensical data type definitions

In a language that enforces a separation between data and codata, it seems like (with a Haskell-esque syntax):

data Silly = Very Silly

should be statically disallowed, since the resulting type Silly is uninhabited.

First, is this intuition correct? Or does defining types of this kind have some utility?

Second, what's the dual concept for codata? I think that if we are ruling out Silly as uninhabited, we should probably also rule out both of:

codata CoSilly = Quite CoSilly
codata CoSilly2 = Just Int | Quite2 CoSilly2

for a related reason.

I'm thinking that the rule is something like this: A data definition must contain at least one alternative which is not recursive. A codata definition must not contain any alternative which is "only recursive" (meaning that it is a product of only recursive references to the codata type being defined).

Am I even close? I am unaware of the correct terminology here, because searching is getting me nowhere. I've read a few overviews of total functional programming, including Turner's "Total Functional Programming", and learned of other syntactically reasonable data type definitions that have to be rejected because they reintroduce bottom, but didn't find a discussion of types of this kind (which seem merely useless in the data case, but also seem like an enemy of productivity proofs in the codata case).

Silly Problems

data Silly = Very Silly

At one point I considered trying to disallow such types, much as you're doing right now. Then I thought: hey, if the type is uninhabited, it isn't going to cause any problems... other than looking Silly.

A data definition must contain at least one alternative which is not recursive. A codata definition must not contain any alternative which is "only recursive" (meaning that it is a product of only recursive references to the codata type being defined).

Any given codata value supports ALL of the destructors/methods for its type. In this sense, codata values are a bit like records or objects... minus object identity and effects.

Second, what's the dual concept for codata?

Nearly the same as for datatypes. You can't have every destructor return just the codata type. However, whereas in a datatype you need to have at least one constructor that is potentially non-recursive, with codata you need to have one destructor that is not completely recursive. It would be alright if even one destructor provided information other than the codata type, even if that something was recursive.

data Silly = Constr1 Silly | Constr2 Silly
codata CoSilly = destr1:CoSilly & destr2:CoSilly

data Okay = Constr (Maybe Okay) -- potentially non-recursive
codata CoOkay = destr:(CoOkay,Maybe CoOkay) -- not completely recursive

Unlike with inductive datatypes, you likely could syntactically express construction of a CoSilly as an unfold from unit (or any other value). Still, it wouldn't actually cause problems... it isn't as though languages supporting codata should expect termination of any fold over them, and they'd do well to restrict them to unfolds.

Right, but...

I agree that it isn't too important to disallow uninhabited data types, because they can't cause problems besides looking silly.

I'm actually more concerned about the codata situation. My (admittedly incomplete) understanding is that the dual of termination is productivity, which means something along the lines of "sure this list might be infinite, but you can always get the next element of it (or a demonstration that it is empty) in finite time." Is that not correct?

If it is correct, and you allow a type like:

codata CoSillyTree a = Leaf a | Branch (CoSillyTree a) (CoSillyTree a)

then what happens when someone writes:

oops :: a -> CoSillyTree a
oops x = Branch (oops x) (oops x)

Up to my current understanding, this definition of oops is guarded corecursive, because each corecursive call to oops is inside a constructor of CoSillyTree, but it seems useless in a way analogous to non-termination. Disallowing this definition of CoSillyTree might prompt the user to see that using the following definition instead avoids this problem:

codata CoTree a = Nilt | Branch a (CoTree a) (CoTree a)

So, I'm asking, why not change your rule from "with codata you need to have one destructor that is not completely recursive" to "with codata you need to not have a destructor that is completely recursive"?

I think whatever I am missing here has to do with not quite understanding what you mean by "Any given codata value supports ALL of the destructors/methods for its type."

First you need to grok codata...

If it is correct, and you allow a type like:
codata CoSillyTree a = Leaf a | Branch (CoSillyTree a) (CoSillyTree a)

Supposing X is a value of codata type 'CoSillyTree T', X should be able to answer all the destructors of 'CoSillyTree', including 'X.Branch' and 'X.Leaf'.

Admittedly, talking about branches and leaves is a bit unusual for a codata type... more likely you'd say: 'X.left', 'X.right', 'X.value'

Chances are, you're not thinking destructors when you wrote about Leaf and Branch.

The beginning of your misunderstanding is, I believe, that 'or' bar '|'. Stick with using 'and' or '&' when describing codata. For example type Stream A = head:A & tail:Stream A. Using '&' will help you remember that codata is described by destructors, not by constructors.

I suspect that you are thinking about lazy inductive types rather than codata types. They are different things. You need to interleave codata and data to produce the equivalent potentially terminating streams. For example: type TStream A = head:A & tail:(Maybe (TStream A)).
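For concreteness, the destructor-style TStream above can be rendered in plain Haskell as a lazy record whose fields play the role of destructors. This is only a sketch: the field names hd and tl, and the countdown/toList helpers, are illustrative choices, not anything from the thread.

```haskell
-- Destructor-style potentially-terminating stream: each value answers
-- 'hd' (an element) and 'tl' (maybe a rest-of-stream).
data TStream a = TStream { hd :: a, tl :: Maybe (TStream a) }

-- A terminating stream: n, n-1, ..., 0.
countdown :: Int -> TStream Int
countdown n = TStream n (if n <= 0 then Nothing else Just (countdown (n - 1)))

-- Observe the stream by repeatedly applying the destructors.
toList :: TStream a -> [a]
toList s = hd s : maybe [] toList (tl s)
```

Note that every TStream value supports both destructors; the "is it finished?" information lives inside the result of tl, not in a choice of constructor.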

what happens when someone writes:
oops :: a -> CoSillyTree a
oops x = Branch (oops x) (oops x)

Type error. 'oops' is incomplete because it doesn't indicate what the 'Leaf' destructor applied to the result of 'oops' would return.

The dual of termination is productivity, which means something along the lines of "sure this list might be infinite, but you can always get the next element of it (or a demonstration that it is empty) in finite time." Is that not correct?

Codata is not the dual of termination; it is the dual of data with respect to construction vs. destruction. I've no clue what the dual of termination would be, but by guaranteeing termination you guarantee only that you'll return from every operation, not that you'll obtain useful information by doing so.

Folds over datatypes will pattern-match and destruct the datatype into component types. Unfolds over types will construct codatatypes.

I think whatever I am missing here has to do with not quite understanding what you mean by "Any given codata value supports ALL of the destructors/methods for its type."

Any closer to understanding it after this post? ^_^

Suppose for a moment that you had:

codata CoTree a = Node:(a,(CoTree a),(CoTree a))

That is, CoTree has one destructor that returns a triple (a,(CoTree a),(CoTree a)).

In this case, supposing X is a CoTree a, we could ask for X.Node, which would return a triple. Since we have a triple, we could ask for 'X.Node.2' to return the second value of that triple. Given termination, this would occur in finite time. Usefully, we could ask for something like X.Node.2.Node.3.Node.1, which would be the value reached by first following the node left, then following the node right, then asking for a value.

But it would be unusual to have a 'destructor' that returns a triple... it makes more sense to have three destructors, so one can just ask for X.left.right.value.
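The three-destructor version can be sketched in plain Haskell as a lazy record, with laziness standing in for codata; the names CoTree, value, left, right, and the constTree helper are assumptions of the sketch, not canon from the thread.

```haskell
-- A codata value as a record of destructors: every CoTree answers
-- 'value', 'left', and 'right'.
data CoTree a = CoTree { value :: a, left :: CoTree a, right :: CoTree a }

-- Corecursive definition of the infinite tree labelled x everywhere.
constTree :: a -> CoTree a
constTree x = t where t = CoTree x t t
```

Although the tree is infinite, any finite chain of destructor applications, such as value (right (left t)), returns in finite time.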

Thanks for your patience

Thanks for the patient explanation.

I think I am much closer to understanding it now.

I'm still a little hazy on how to translate between the concrete syntax used in section 4 of this paper by Turner, referenced elsewhere on LtU, and the destructor style: Turner seems to express the type of possibly infinite lists in a way more like my mistaken one. I assume a simple translation exists, but I can't quite figure out what it would be or how it would get around this problem.

Turner offers (on page 762) the example:

codata Colist a = Conil | a <> Colist a
fibs :: Colist Nat
fibs = f 0 1
where
f a b = a <> f b (a+b)

How should I mentally translate this constructor-style definition of a (potentially-)infinite list type into a destructor-style one? I understand your specific example of type TStream A = head:A & tail:(Maybe (TStream A)) but I can't figure out how to automate the translation (or whether/why/how the translation would need to fail when presented with a goofy definition like the one in the original misguided question).

It seems like the translation here needs to end up with something like:

data Colist a = L { hd :: a, tl :: Maybe (Colist a) }
fibs :: Colist Nat
fibs = f 0 1
where
f a b = L { hd = a, tl = Just (f b (a+b)) }

but this is probably just as wrong as the rest of my thinking on this topic has been. Is there a textbook overview of these issues that I should be reading?

Turner's Notation

Is equivalent to having a single implicit destructor, destroy, that can return one value (which may be of any datatype, including a recursive one).

codata Colist a = destroy:(Nil | Cons (a, Colist a))
codata ColistInf a = destroy:(a, ColistInf a)

The single-destructor approach towards codata is common among people who approach codata after having worked with lazy languages (where 'codata' looks a lot like a weakened form of lazy structures), rather than those approaching codata after working with strict languages (where 'codata' looks a lot like methods on an abstract value-object). Using the single destructor makes it a bit easier to inject some extra 'or' types into the mix, since you'd really need to use 'Maybe (TStream A)' to get what Turner gets with 'Colist'.

It is worth noting that both Colist and ColistInf above are useful (non-silly) codatatypes. Also, since there is only one destructor the only 'Silly' type is any homomorphism of:

codata CoSilly = destroy:CoSilly

For example, even these are not silly:

codata CList = destroy:(X CList | Y CList)
codata CList = destroy:(Maybe CList)

In the first case one could get information (effectively 0s and 1s) when destructing the CList. In the latter case, one effectively has a 'potentially infinite' integer... i.e. one can make a useful observation as to whether destroy returned Nothing or whether you'll need to examine further before (potentially) reaching Nothing.
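Under this single-destructor reading, Turner's fibs (quoted in the parent post) can be sketched in plain Haskell, with laziness standing in for codata. The newtype wrapper, the field name destroy, and the takeCo helper are assumptions of the sketch:

```haskell
-- Single implicit destructor: 'destroy' either ends the list (Nothing)
-- or yields an element and the rest of the colist.
newtype Colist a = Colist { destroy :: Maybe (a, Colist a) }

-- Turner's fibs, written as a corecursive construction.
fibs :: Colist Integer
fibs = go 0 1
  where go a b = Colist (Just (a, go b (a + b)))

-- Observe finitely many elements by repeated destruction.
takeCo :: Int -> Colist a -> [a]
takeCo 0 _  = []
takeCo n xs = case destroy xs of
  Nothing     -> []
  Just (x, t) -> x : takeCo (n - 1) t
```

Each recursive call to go is immediately wrapped in the codata constructor, so every application of destroy returns in finite time: this is productivity in the single-destructor style.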

Coming back to your examples:

codata CoSilly = [implicit destroy:](Quite CoSilly)

Assuming 'Quite' is a new constructor, this would indeed be silly. It's a homomorphism of the above CoSilly = destroy:CoSilly.

However, the other example...

codata CoSilly2 = [implicit destroy:](Just Int | Quite2 CoSilly2)

... is not silly. One can get information by observing data of this type, be it the integer or the fact that one hasn't reached the integer.

codata CoSillyTree a = [implicit destroy:](Leaf a | Branch (CoSillyTree a) (CoSillyTree a))

And this one is also not silly. It is possible that the tree has nothing but branches in it, but each Branch might have leaves (with values), so there can be a lot of information in this tree, and it is even information when you notice that there's another branch rather than a leaf.

However, a type for a completely recursive tree would be silly...:

codata CoSillyTree a = destroy:(Branch (CoSillyTree a) (CoSillyTree a))

This would be homomorphic to a type with only the destructors left:(CoSillyTree a) and right:(CoSillyTree a). It is also homomorphic to all other silly codata structures...

Light dawns

David, thank you again, that is a great explanation.

Small remark

This data-type is inhabited. At least, in Haskell.

data Silly = Very Silly deriving Show

harmless:: Silly
harmless = Very harmless

A quibble from the ignorant

As Lennart Augustsson corrected me, the Haskell standard does not specify how the language will interpret recursive declarations involving only constructors. ghc will create cyclic structures, so inhabiting the datatype, but other implementations might do other things and still conform to the standard.

I confess to being unable to find useful information in the Haskell 98 standard, such as: is a Haskell implementation required to reject programs that can't be given types, e.g. (head 0)? And I don't know what the limits are on a conforming Haskell implementation when it comes to dealing with examples like marco's above. Heap overflow seems like a good bet, or some representation involving futures seems very Haskell-like, but what about rejecting the well-typed declaration of harmless on the grounds that its dumb representation is infinite? I'd guess not, but since I can't find authoritative guidance in the standard...

Why would you reject it?

Silly as it may be, it seems a perfectly valid and usable type. One similar type could be used, say, in a Haskell-like implementation of the unix yes command, where Yes would model the type of the structure being printed [say yes, equivalent to harmless].

The standard just seems to be lacking.

[Ok, I thought we were discussing lazy languages.]

[Actually, the example also makes sense in an eager language which allows cycles.]

Usable, but never Useful

Keep in mind that the 'Silly' type bears no information whatsoever. Any function you passed it into could compile out all checks related to Silly.

To be useful, a datatype must carry information. Asking a question to which the answer might be 'Yes' is only useful insofar as a different answer might be provided.

If you want a function that prints 'Yes' without end in a language that doesn't guarantee termination, you can do so without involving types.

Well-foundedness

Your definition is, I think, equivalent to wanting all type declarations to be well-founded (or Noetherian, whatever you like).

So, yeah, I guess your assumption is right.

Divergent Inhabitants

Being inhabited by a structure over which all observations are divergent is, I believe, the same as being uninhabited.

Anyhow, it is clear with the concerns over codata and mention of Turner's Total Functional Programming that the OP is not considering lazy types.

Laziness might still occur behind the scenes, possibly exposed to programmers as an optimization strategy, but would not be available in manners that change the semantics of the program (modulo divergence and error, all orderings for evaluating 'pure' functional programs are equivalent). Laziness could still exist meaningfully for effect types...

Useful

To be useful, a datatype must carry information.

This is analogous to saying that, to be useful, a natural number must be greater than one, because any arithmetic involving just 0's and 1's is trivial!

That's arguably true when only writing expressions over constants. But when defining functions over the natural numbers, having 0 and 1 available is essential. The analogy is: For higher-order type constructors, trivial types are essential building blocks.

For example, consider a binary tree "data Tree t = Leaf t | Node (Tree t) (Tree t)". The tree carries information in two ways: in its leaves, and in its node structure. If you have a single-element datatype (call it Unit), then "Tree Unit" still carries useful information, and disallowing a Unit type in a language would be counterproductive.

A similar argument can be made for allowing an uninhabited type. Consider "data Maybe t = Just t | Nothing". If you have an uninhabited type (call it Null), then Maybe Null is still valid, and any value of this type is guaranteed to be Nothing. Such reasoning is useful for dealing with data polymorphically.
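Modern Haskell ships an uninhabited type, Data.Void, which lets this reasoning be spelled out directly; the function name collapse here is an illustrative choice.

```haskell
import Data.Void (Void, absurd)

-- Every total value of type Maybe Void must be Nothing, so the Just
-- case can be dismissed with 'absurd' (which eliminates the empty type).
collapse :: Maybe Void -> ()
collapse Nothing  = ()
collapse (Just v) = absurd v   -- statically unreachable
```

This is exactly "dealing with data polymorphically": the generic Maybe machinery still applies, and the uninhabited parameter prunes one branch.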

Analogy needs explanation...

To be useful, a datatype must carry information.

This is analogous to saying that, to be useful, a natural number must be greater than one, because any arithmetic involving just 0's and 1's is trivial!

Sorry, I'm really not seeing the analogy. It seems to me more analogous to saying that, to be useful, the set of natural numbers cannot be restricted to containing only the value '0'.

If you have a single-element datatype (call it Unit)

Unit is not usually considered a 'data' type. It is used for synchronization. Any place you might use 'Unit' for data, you can optimize it right out. Unit carries no data and no information, excepting in communications scenarios, where it implicitly carries arrival order and arrival time.

There is reason to keep Unit in a language that does anything more than process data... but there is little reason to have more than one version of it, except insofar as it is perhaps more effort than it is worth to prevent there being more than one version of it.

tree carries information in two ways: in its leaves, and in its node structure

I don't disagree. It was noted above that 'codata CList = destroy:(X CList | Y CList)' is a useful, non-silly codatatype. It is homomorphic to an infinite stream of 1s and 0s. It carries information.

A similar arguments can be made for allowing an uninhabited type. Consider "Maybe t = Just t | Nothing". If you have an uninhabited type (call it Null), then Maybe Null is still valid, and any value of this type is guaranteed to be Nothing. Such reasoning is useful for dealing with data polymorphically.

Agreed. I can see how use of uninhabited types to trim out certain tagged unions from a collection (of more than just the two you name) would be of some utility. It could lead to some optimizations if the compiler knows the type is uninhabited.

I would suggest using a special type for that purpose, such as 'Unit' for synchronization. Perhaps 'Void' for the empty type.

Analogy

To be useful, a datatype must carry information.

This is analogous to saying that, to be useful, a natural number must be greater than one, because any arithmetic involving just 0's and 1's is trivial!

Sorry, I'm really not seeing the analogy. It seems to me more analogous to saying that, to be useful, the set of natural numbers cannot be restricted to containing only the value '0'.

Think of cardinalities, the number of distinct values which inhabit a type. The uninhabited type has cardinality 0, and the Unit type cardinality 1. The cardinality of (a,b) is the cardinality of a times the cardinality of b. The cardinality of "data T = X a | Y b" is the cardinality of a plus the cardinality of b. To say two types have equal cardinality is equivalent to saying the types are isomorphic. Anything you can say about finite types, you can equivalently say about natural numbers, and vice-versa; and 0 and 1 are certainly useful natural numbers!
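The cardinality arithmetic can be checked concretely by enumerating inhabitants; in this sketch, Void stands for the uninhabited type (cardinality 0), () for Unit (cardinality 1), and the list names are mine.

```haskell
import Data.Void (Void)

-- |Either a b| = |a| + |b|: here 1 + 2 = 3 inhabitants.
sums :: [Either () Bool]
sums = [Left (), Right False, Right True]

-- |(a, b)| = |a| * |b|: here 1 * 2 = 2 inhabitants.
products :: [((), Bool)]
products = [((), False), ((), True)]

-- |(Void, b)| = 0 * |b| = 0 inhabitants: the list is necessarily empty.
empties :: [(Void, Bool)]
empties = []
```

Just as 0 and 1 are the identities of addition and multiplication, Void and Unit are the identities of sum and product types, which is one way of cashing out the analogy.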

Any place you might use 'Unit' for data, you can optimize it right out.

It can be optimized away, but its existence is still important. For example, consider an eager language like ML, where every function has type t->u for some types t and u. A Unit type enables you to represent a function that takes no parameters, using the same syntax and semantics as functions that take a parameter.

C++ suffers from a silly usability flaw because its void "type" isn't first-class. So, for example, you can use a template function to abstract over all functions taking a parameter of ordinary types, but such a template doesn't work for functions with no parameters.

Also consider a language that allows bounded natural number types. You can represent an array of length n as a function from the natural numbers less than n to its range type, and then access elements using a(i) syntax. You can do this in Haskell using an inductive construction of the natural numbers.

If the language disallows uninhabited or unit types, then zero-element arrays and one-element arrays can't be expressed!

But void is unit?

(This is probably just as wrong as everything else I have said so far on this thread, so please take this in the spirit of a request for clarification about why this line of thinking is misguided.)

But the C++ void (pseudo-)type isn't an uninhabited type, it's a (the) unit type?

I do take your point about the utility of phantom types as arguments to type constructors, but I wonder if it might not be possible to exclude them from the kind *.

For example, is it possible to define KNat as the kind of phantom types representing type-level naturals, and give kinds like TSucc :: KNat -> KNat and Vector :: KNat -> * -> *, to prohibit uninhabited types from appearing where we expect "normal" types but still have them where we want them?
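In modern GHC, one way to approximate this is the DataKinds extension, which promotes a datatype to its own kind so that the type-level naturals never inhabit kind *. The following is a hedged sketch (Nat, Vec, and vhead are illustrative names, not from the thread):

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- With DataKinds, 'Z and 'S are promoted to types of a new kind Nat,
-- distinct from the kind * of ordinary (inhabitable) types.
data Nat = Z | S Nat

-- A length-indexed vector: the index n has kind Nat, not *.
data Vec (n :: Nat) a where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Total head: the type forbids applying it to VNil.
vhead :: Vec ('S n) a -> a
vhead (VCons x _) = x
```

Here the "phantom naturals" can appear as indices but never where a normal type is expected, which is essentially the kind-level segregation asked about.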

In C++, 'void' is a Unit,

In C++, 'void' is a Unit, almost. There are some exceptions, such as the inability to pass 'void' as an argument to a multi-argument function.

The 'Void' discussed here would be a type that is uninhabited. If a function required 'Void' as an input, that function could never be called.

C++ void

Since we can write "int f(void) {..}", call f(), and use its result, C++'s void "type" is unit-like: it's inhabited but doesn't contain any information. If void were uninhabited, then we'd be unable to call f, because neither we nor the compiler could produce a value to pass to f.

Some dependent type systems make it possible to define new kinds and limit them by cardinality, for example "the kind of all types with two or more values". In a proofs-as-programs language like Coq, this can be expressed as a tuple containing a type, two elements of that type, and a proof that the elements are unequal.

Cardinalities

The cardinality of (a,b) is the cardinality of a plus the cardinality of b.

*confused* - do you mean the cardinality of a times the cardinality of b?

Anything you can say about finite types, you can equivalently say about natural numbers, and vice-versa; and 0 and 1 are certainly useful natural numbers!

Sorry, still not catching you there. If we were considering 0 and 1 as types for uninhabited and unit respectively (and 2 would be boolean) it would really be difficult to say that '0 and 1' are useful data types... not unless we start doing advanced computations with some sort of type calculus.

A Unit type enables you to represent a function that takes no parameters, using the same syntax and semantics as functions that take a parameter.

Unit, when used in initiating 'procedures', operates as a synchronization type, not a data type. I've said already that unit has value as a synchronization primitive. I'm just careful to not call unit a 'data' type. (I vehemently disagree with language design decisions that unify procedures and functions... but that's for another discussion.)

If the language disallows uninhabited or unit types, then zero-element arrays and one-element arrays can't be expressed!

A zero-element array would be expressed as unit, not as an uninhabited type. The bounded natural number would be zero.

Unit nits

Unit is not usually considered a 'data' type.

Is that true? Considered by whom?

Do you not consider Unit to be acting as an (empty) datatype in the Tree example Tim gave?

data Tree t = Leaf t | Node (Tree t) (Tree t)
data Unit = Unit -- If we get an Error here...
type BareTree = Tree Unit -- ... then we can't do this, which is useful

In a strict language, functions from Unit can be used to explicitly represent delayed values, as in:

 data Stream t = Stream (t, Unit -> Stream t)

Is that what you mean by "synchronization"? Again, is this standard usage of that term?

'Data' has a denotation in

'Data' has a denotation in reference to information, and a data type is thus one that can bear information... i.e. something you could stick in a column of an RDBMS. In this sense, unit should not be considered a 'data' type. I certainly do not consider it one. And, considering that I've never seen a language that defines unit as anything other than a 'primitive' type (as opposed to a 'data' type), I'd think most people aren't calling it a data type.

But perhaps I'm in error... absent contradiction, weak evidence and confirmation bias and simple reasoning have the potential to shape my worldview. Do you consider 'unit' to be a 'data' type?

Do you not consider Unit to be acting as an (empty) datatype in the Tree example Tim gave?

In Tim's example, Unit was simply a way of reducing:

  data Tree t = Leaf t | Node (Tree t) (Tree t)


to the equivalent of

  data Tree = Leaf | Node Tree Tree

... which is a data type.

[I'll note that you've since edited your example to make the reduction clearer.]

I can see such reductions in the calculus of producing types as useful (similarly you could reduce variants by using 'Void'). But it doesn't mean that Unit or Void are data types.

In a strict language, functions from Unit can be used to explicitly represent delayed values [...] Is that what you mean by "synchronization"? Again, is this standard usage of that term?

Delayed values I would generally represent by something other than functions over unit. But that certainly is an example of synchronization. Unit can also be passed between actors in an actor model, or tossed into a dataflow to initiate behaviors, or used to represent events in a reactive program.

Any message implicitly carries with it arrival order. When used for such communication effects, 'synchronization' is appropriate and is the word I've seen used in the Oz/Mozart literature I've been perusing, among other literature.

I consider "datatype" in

I consider "datatype" in this context as short for "algebraic datatype", of which Unit is a simple example with a single unparameterized constructor. The point of Tim's example was, I think, that when you have a datatype parameterized by another datatype, it can be useful to take the parameter to be Unit or Null. It sounds like you don't disagree and that this is just a terminology issue.

[I'll note that you've since edited your example to make the reduction clearer.]

Hmm, I haven't edited that post. Maybe you read it differently the first time through?

The way I see it, the utility of Unit exists at a very core language kernel consisting purely of values and types, and so I see actors, message passing, and synchronization as red herrings.

Paramaterization

I consider "datatype" in this context as short for "algebraic datatype", of which Unit is a simple example with a single unparameterized constructor.

Unit is a degenerate example. It's also a degenerate example of a record or piece of codata with no features. Degenerate examples often need special consideration.

The way I see it, the utility of Unit exists at a very core language kernel consisting purely of values and types, and so I see actors, message passing, and synchronization as red herrings.

... don't forget those evaluative forms. Unit wouldn't have much utility in a language where all you could do is write it in by hand.

Unit type makes a useful special type parameter, and unit value makes a useful synchronization primitive. No red herring there.

If you have codata, unit also makes a useful target for 'unfold', somewhat similar to your 'Stream' example but without the implication that functions over unit are somehow different (e.g. divergent, subject to error, or capable of initiating effects) from the plain old values they return.

Hmm, I haven't edited that post. Maybe you read it differently the first time through?

That must be the case, then.

If it's a poll...

I would certainly consider Unit to be a data type. But I roughly consider "type" and "data type" synonymous in most languages. If you want to distinguish some subset of types as "data types", I guess you can do that in a principled way, but I don't think it's a standard usage.

I roughly consider "type"

I roughly consider "type" and "data type" synonymous in most languages.

effect types (delay, error, communication), interfaces and object behavior, contracts and precondition/postcondition descriptions, region inference and linear or sub-structural types, codata types, trust/security and phantom capability types, ...

I'm not certain how anyone who keeps up with PLT could come away considering 'type' and 'data type' to be the same. It seems a sizeable bite of PLT is type theory.

Not a total stranger...

I'm aware of all those types of types, and I agree that most of them should not in a certain sense be called "data types," but that's not exactly what I had in mind...

I guess I just wouldn't be inclined to use the phrase "data type" at all, outside of a very specific comparison (e.g., data type vs codata type in particular formal setting). When I encounter the phrase, it's usually being used by a layperson as a synonym for type, and I guess that's really what I had in mind. I believe the average programmer thinks that a type must be a type of something, and in programming, that something is "data." I realize that this is not a theoretically nuanced view, but I think it's fairly common. Witness the fact that in Haskell, all types are declared with "data"... I suppose this may bother you more than it bothers me.

But even if you restrict data type to mean "an inhabited type specifying data," I'd still certainly call Unit a data type. I just don't see any reason at all to disqualify it, and it plays an important role in theories of data.

As I said in my last post, I think I do recognize the distinction you want to make, I just think that "data type" is an overloaded phrase, and it's more often used in a broader sense.

Also...

Also, note that I said "in most languages". I think it's fair to say that most languages do not have the rather exotic types that you mentioned.

That's fair.

note that I said "in most languages"

That's fair.

if you restrict data type to mean "an inhabited type specifying data," I'd still certainly call Unit a data type.

And I will not. If the values of a given digital value type can be fully described in 0 bits, then I'm going to refuse to give it a name that implies it carries information.

I'm reluctant to perpetuate

I'm reluctant to further perpetuate this moderately mind-numbing sub-thread, but if the test is "carries information," it looks like Unit should still be counted. It just carries 0 bits of information. Bits(Bool) = 1. Bits(Unit) = 0. Bits(IOEffect) = type error.

The test is "does it carry

The test is "does it carry information?" not "does it make sense to ask whether or not it carries information?".

Digital structures that can be persisted or communicated in zero bits carry no information, no data, at all.

And, obviously, types (like IO Effect) where asking whether they carry information doesn't even make sense, are also not data types.

What sort of logic is that?

What sort of logic is that?

The point was that for effect types and other exotic types, it doesn't even make sense to ask how much information they carry. For Unit, it does make sense and the answer is 0. [Never mind - your edited post addresses this point (though sadly doesn't reference Santa Claus anymore)]

BTW, one more interesting consequence of your definition is that in a language with sufficiently expressive types, it becomes undecidable whether a given type is a datatype or not. Maybe we need a new keyword like "data??" for this situation... ;)

What's wrong with that codata?

You might want to statically reject ill-founded inductive types, but I don't see the sense in your proposed restrictions on codata.

Codata can be infinite and built by unrestricted recursion, so the only way a codata type can be uninhabited is if each constructor requires an argument of an uninhabitable type.

With mutual definitions that's a least fixed point:

codata A = A B
codata B = B A

is inhabited by

let a = A (B a)

It might be good to warn about uninhabitable inductive definitions in favor of some special syntax for empty types, but this sort of codata is inhabited so the most I think you would want to do is optimize it out.
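The inhabitation argument above can be checked directly in lazy Haskell, where the cyclic pair of values typechecks and can be observed in finite time; the constructor names MkA/MkB and the unwrapTwice helper are mine.

```haskell
-- Mutually recursive types, each wrapping the other.
data A = MkA B
data B = MkB A

-- A cyclic inhabitant: a and b are defined in terms of each other.
a :: A
a = MkA b

b :: B
b = MkB a

-- Peel off one A-layer and one B-layer; finite observation of an
-- infinite (cyclic) structure.
unwrapTwice :: A -> A
unwrapTwice (MkA (MkB x)) = x
```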

Agreed, optimize out is the way to go...

As I said above, it ain't worth the effort to recognize silly types just so you can complain to the programmer. If you're going to bother recognizing them, you might as well optimize types of cardinality 0 and 1 out of both system and memory, except where used in a manner that cannot be removed (such as use of 'unit' for message passing and dataflow synchronization).

No reason to complain about something that isn't an actual problem, especially when it can be handled elegantly in an automated manner.

Maybe complain about void

In a total language, if a type is uninhabitable, then any code that purports to receive or construct values of that type will just be dead. It might be helpful to warn about those types, and provide special syntax like

Inductive T : Type := .

in Coq, or Haskell's

data Void

Of course, when void types arise from functor instantiation or the like, that's perfectly useful (ruling out certain kinds of leaves in a tree, maybe), and if you can even detect those cases the thing to do is optimize.

I agree that trivial codata is something to optimize out rather than complain about - you can have a perfectly useful function that destructures a value in such a type as it goes, and change the type to something nontrivial.