## The Cat Language Kind System

My latest attempt to describe the Cat kind system with some kind of rigour is at http://www.cat-language.com/paper.html.

Here is an excerpt from the kind system description:

  kind ::==
| type            the kind of all types
| stack           the kind of all stacks

type ::==
| a               type variable
| var             variant
| int             integer
| bool            bool
| string          string
| list            list
| ∀ ρ . (stack, ρ) -> (stack, ρ)   function

stack ::==
| A               stack variable
| nil             empty stack
| type, stack     stack with a single type on top
| stack, stack    stack with a stack on top
| stack \ stack   stack difference operation

I'd be exceedingly greatful for any comments on the paper.

## Comment viewing options

### The stack difference

The stack difference operation makes no sense to me. Here's your definition:

    A : (A0, A1, ..., Am, Ï)
B : (B0, B1, ..., Bn, Ï)
A \ B =
m < n  | (Am, Am+1, ..., An)
m >= n  | nil


What is Am+1 supposed to be? Perhaps the definition should be as follows?

    A : (A0, A1, ..., Am, Ï)
B : (B0, B1, ..., Bn, Ï)
A \ B =
m > n  | (An, An+1, ..., Am)
m <= n  | nil


As for your kind system, I don't understand it enough to make any useful comment. An example of type inference for a complex term might help to clarify things.

### Good catch!

Good catch! It is in fact supposed to be ...

    A : ∀ ρ . (A0, A1, ..., Am, ρ)
B : ∀ ρ . (B0, B1, ..., Bn, ρ)
A \ B =
m < n  | ∀ ρ . (Bm, Bm+1, ..., Bn, ρ)
m >= n  | ∀ ρ . (ρ)

Thanks for pointing that out, I've updated the page.

Because if A is nil then m is -1, right?

### Good catch

Yes I believe you are correct, thank you for pointing that out.

### Pain points?

As for your kind system, I don't understand it enough to make any useful comment. An example of type inference for a complex term might help to clarify things.

Are there any particular pain points you can identify? I fear that using a complex example, might not help.

Nonetheless, here is an example:

define f { [dup 10 <] [inc] while }
f : (int)->(int)


Does that help? Or are you talking about something even more sophisticated? Do you have any examples?

define f { [dup 10 <] [inc] while }
f : (int)->(int)

That's a good start. Now, I can see why f would have the (int)->(int) type - the example you've chosen is quite intuitive. One of the purposes of a type (or kind) system, however, is to enable automatic type derivation. So, how would you infer the type of f without using your intuition, but the rules of the kind system alone? Start with your basic definitions

 dup : (a:any) -> (a a) 10 : () -> (int) < : (int int) -> (int) inc : (int) -> (int) while : ((A:any*)->(A) (A bool)->(A) A)->(A) f : (A)->(B) => [f] : ()->((A)->(B)) g : A -> B, h : C -> D => f : (A C\B) -> (D B\C) 

and write down the entire proof leading to f : (int)->(int), with no leaps of imagination. Then invite us again to try to find a hole in the proof. That's the way it works in mathematics, anyway: you can't just give a set of axioms and theorems and say "hey, I think I have a cool theory here, is there anything I'm missing?", you're supposed to provide proofs leading from axioms to theorems.

If you can correctly infer the type of at least one relatively complex expression, it's already pretty likely that your kind system is useful. The next step would be proving that the system is complete and sound.

By the way, the type of while looks weird. Shouldn't the condition argument produce a Boolean instead of consuming it?

### Thanks for the tips

By the way, the type of while looks weird. Shouldn't the condition argument produce a Boolean instead of consuming it?

You are correct.

Below is my attempt to walk through the steps of arriving at the typing using the semantic rules of Cat. I have varied a bit from the paper, in the hope of having made improvements. I wonder how close I am to actually showing some kind of proof, and if I need to show even more detail?

The example in the following program is a bit silly, it takes an integer and uses a while loop to increment it to 10. If the number is greater than 10, the program never ends.

define silly { [dup 10 <] [inc] while }

// definitions

dup : (a:any) -> (a a)
10 : () -> (int)
< : (int int) -> (int)
inc : (int) -> (int)
while : ((A:any*)->(A) (A bool)->(A) A)->(A)

// rules

1. f : (A)->(B)
⇒ [f] : ()->((A)->(B))

2. g : (A)->(B), h : (C) -> (D), f = g h
⇒ f : (A C\B) -> (D B\C)

Some of the implications of this rule:

2a. g : ()->(), h : (A)->(B), f = g h ⇒ f : (A) -> (B)
2b. g : ()->(A), h : (A B)->(C), f = g h ⇒ f : (B) -> (C)
2b. g : (A)->(B), h : ()->(), f = g h ⇒ f : (A) -> (B)
2c. g : (A)->(B), h : (B)->(C), f = g h ⇒ f : (A) -> (C)
2d. g : (A)->(B C), h : (B)->(D), f = g h ⇒ f : (A) -> (D C)
2e. g : (A)->(B), h : (B C)->(D), f = g h ⇒ f : (A C) -> (D)

3. g : (A)->(B), h : (C) -> (D), f = g h
⇒ Σ i=0..min(len(B), len(C)) Bi <: Ci

4. ∀a . a <: Prim ⇔ a = Prim

5. f : (A)->(B), g : (C)->(D), f <: g
⇒ A <: B, D < B

6.  A <: B
⇒ (a01 ... am) <:
(b0 b1 ... bn)
⇒ Σ i=0..min(m,n) ai <: bi

7. (a01 ... am,
am+1, ..., an)
<: (b0 b1 ... bm, X:any*)
⇒ Σ i=0..m ai <: bi, X = (am+1 ... an)

// derivations

// apply rule 1.
[inc] : ((int)->(int))

// rewrite
10 < : () -> (int) (int int) -> (bool)

// apply rule 2b.
10 < : (int)->(bool)

// rewrite
dup 10 < : (a:any) -> (a a) (int) -> (bool)

// apply rule 2 and rule 3
dup 10 < : (a:any) -> (a bool), a <: int

// apply rule 4
dup 10 < : (int) -> (int bool)

// apply rule 1:
[dup 10 <] : ()->((int) -> (int bool))

// rewrite
[dup 10 <] [inc] while : ()->((int) -> (int bool))
((int) -> (int)) ((A:any*)->(A) (A bool)->(A) A)->(A)

// simplify
[inc] while : ()->((int)->(int)) ((A:any*)->(A) (A bool)->(A) A)->(A)

// apply rule 7
A = (int)
[inc] while : ()->((int) -> (int)) ((int)->(int)
(int bool)->(int) int)->(int)

// apply rule 2b
[inc] while : ((int bool)->(int) int)->(int)

// rewrite
[dup 10 <] [inc] while : ()->((int) -> (int bool))
((int bool)->(int) int)->(int)

// apply rule 2b
[dup 10 <] [inc] while : (int)->(int)


Is this the kind of thing you would expect in the paper Mario? How could I improve on this to make it more standard, and easier for people to follow?

Thanks again!

### Thank you

Is this the kind of thing you would expect in the paper Mario?

Yes, that's perfect. I've glanced at your proof and I don't see any obvious problems.

How could I improve on this to make it more standard, and easier for people to follow?

I agree with Matt's comments. The natural deduction style is well-established, not to mention easier to follow, and he's probably right about the argument order as well: I was also confused by it at first. Another problem with your notation is that your "->" binds tighter than juxtaposition, which looks very confusing in terms like

   ((A:any*)->(A) (A bool)->(A) A)->(A)


I understand that your notation is more consistent than Forth's, but you might consider at least replacing the function type notation (A1 A2)->(B1 B2) by (A1 A2 -> B1 B2). It's just syntactic sugar, but I feel it would greatly help readability. Contrast the above with

   ((A:any* -> A) (A bool -> A) A -> A)


I don't have any more substantial comment right now. I'll try to find some time to analyze your modified system tomorrow or over the weekend.

### Great idea

I think the syntactic rewriting of function types is a great idea!

I don't have any more substantial comment right now. I'll try to find some time to analyze your modified system tomorrow or over the weekend.

I will be rewriting the type rules using the natural deduction style and adding some more examples. I'll post when it is ready, probably not until the weekend.

### T: (A) -> (B, C)

Ok, I revisited your rules and this time it made much more sense to me! One comment (and I may be overlooking something) is that it seems that you could simplify your explanation by splitting the rule for composition into two different rules: C1 (details below) handles composition when the stack contains the parameters of your function, while C2 handles composition when all of the existing stack is consumed.

This using your conventions I think: T is a term, f is a function, a,b are types of functions, A,B,C,D are types of stacks of functions. I realized when I was done writing this that I had reveresed the ordering of your stack types (I think). I have the type of 10 'c' is (int, char), so that terms are in the same order as types. I'm not really sure why you chose the opposite order (maybe it makes more sense in some cases in light of C2?)

Anyway, here are the details:

	 T: (A) -> (B, C)       f: (C) -> (D)
-------------------------------------- (C1)
T f: (A) -> (B, D)

T: (A) -> (B)       f: (C, B) -> (D)
-------------------------------------- (C2)
T f: (C, A) -> (D)

T: (A)->(B)
--------------------- (Q)
[T]: ()->((A)->(B))

So, here's a derivation of the example that you gave:

dup: (a) -> (a, a)    10: () -> (int)
--------------------------------------- (C1)
dup 10: (a) -> (a, a, int)     <: (int, int)->(bool)
------------------------------------------------------- (C1)
dup 10 <:  (int) -> (int, bool)                          inc: (int)->(int)
------------------------------------------ (Q)      ----------------------------- (Q)
[dup 10 <]: () -> ((int) -> (int, bool))              [inc]: () -> ((int)->(int))
--------------------------------------------------------------------------------- (C1)
[dup 10 <] [inc]: () -> ((int) -> (int, bool), (int)->(int))               while: (A, (A)->(A, bool), (A)->(A)) -> A
---------------------------------------------------------------------------------------------------------------------- (C2)
[dup 10 <] [int] while: (int) -> (int)

I just noticed your reply to Mario (should have refreshed when I came back to post). My post is a Natural Deduction style derivation, which is pretty standard way of showing the type rules.

### Ok, I revisited your rules

Ok, I revisited your rules and this time it made much more sense to me!

It is gratifying to know that the logic behind my system is not impenetrable or non-existant. :-)

I should point out that I just reposted with some changes. For example I dropped the notion of "reverse subtyping".

One comment (and I may be overlooking something) is that it seems that you could simplify your explanation by splitting the rule for composition into two different rules: C1 (details below) handles composition when the stack contains the parameters of your function, while C2 handles composition when all of the existing stack is consumed.

The problem here is that the following is legal:

  f : (A)->(B)
g : (C)->(D)
h = f g

If B <: C. The idea of term elimination is correct in part, but is not complete. This is however my own fault because the rules are still incomplete, and my notation is still too far off the beaten path.

I am going to rewrite the rules using the natural deduction style.

I have the type of 10 'c' is (int, char), so that terms are in the same order as types. I'm not really sure why you chose the opposite order (maybe it makes more sense in some cases in light of C2?)

This is a conundrum for me. I have flip-flopped between representing the stack both ways. My rationale for the current approach is that it makes sense to enumerate the items on the stack starting from the top since most stacks we are interested in are infinitely big, and we only care about the top. Given that, it makes sense to me to represent a stack as follows:

A = (a0, a1, ...)

This is why the type annotation reads:

f : (a0 a1)->(b0 b1)

However if we have two nullary functions together we get:

f : ()->(a)
g : ()->(b)
f g : ()->(a b)

Which is somewhat counter-intuitive, but maybe less so when you consider that the Cat operations run in opposite order you might expect: { 2 3 + } versus (+, 2, 3) in most functional languages.

I really don't know what the right answer is. Is it a big stumbling block? Should I use one annotation for the language and another for expressing the type rules? Or perhaps I should simply use the Forth style stack diagrams?

The more I ponder it, the more I think that following your advice, and keeping in line with the Forth tradition may be the correct way to go.

### The problem here is that the

The problem here is that the following is legal:

f : (A)->(B)
g : (C)->(D)
h = f g

If B <: C. The idea of term elimination is correct in part, but is not complete. This is however my own fault because the rules are still incomplete, and my notation is still too far off the beaten path.

Hmmm, well the following should also be a rule (it's almost implicit with the stack notation):

	    T: (A) -> (B)
--------------------
T: (C, A) -> (C, B)


Can you give an example of a term that these rules fail to type? That would help me understand the problem.

### Before that

I am not sure I have a failing example yet. Since I am not well-versed in natural deduction, I don't know how you arrived at the following:

   dup: (a) -> (a, a)    10: () -> (int)
--------------------------------------- (C1)
dup 10: (a) -> (a, a, int)  <: (int, int)->(bool)


How did you arrive at the (int,int)->(bool) axiom?

Also did I provide sufficient rules to cover the replacement of "a" variable with the concrete type "int"?

### Sorry, the <: is less-than

Sorry, the <: is less-than followed by a colon (not subtyping). I was simply assuming that the less-than operator takes a stack of two integers, and returns a boolean. I should have put more space between "dup 10: (a) -> (a, a, int)" and "<: (int, int)->(bool)", as those are the two premises for the next deduction.

I just implicitly unified the types in the derivation. Note: The first three rules I gave labelled C1, C2, and Q are the definitions of those rules. In the derivation that followed, they were being applied.

Here's the definition of the rule C1:

	 T: (A) -> (B, C)       f: (C) -> (D)
-------------------------------------- (C1)
T f: (A) -> (B, D)


So for an example of type unfication look at the next deduction:

          dup 10: (a) -> (a, a, int)     <: (int, int)->(bool)
------------------------------------------------------- (C1)
dup 10 <:  (int) -> (int, bool)


In this case, to match the pattern of the rule C1, we want f = <, so we have to have C = (int, int) and D = bool. Then the only way (a, a, int) can match (B, int, int) is if a = int and B = int. Since a = int, we also have A = int. With this choice of variables we have matched the pattern and are entitled to deduce the conclusion of the pattern with appropriate substitutions.

In the derivation of the type of "dup 10", I was applying rule C1 to the types of dup and 10. In this case, there is no unification needed.