In which sense is Joy functional?

I am confused. I understand that the language Joy is regarded as a functional language. It also seems that it is closer to Backus's FP than, say, Haskell.

The Joy FAQ (http://www.latrobe.edu.au/philosophy/phimvt/joy/faq.html) states that it is functional because expressions are functions operating on a stack, and juxtaposition is function composition. That definition feels (to me) like cheating.
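
To make that reading concrete, here is a minimal sketch in Haskell (the word set and the Int-only stack are my own simplifications):

   -- Each Joy word denotes a function Stack -> Stack, and juxtaposition
   -- is function composition (read left to right).
   type Stack = [Int]

   push :: Int -> Stack -> Stack
   push n s = n : s

   dup, pop :: Stack -> Stack
   dup (x:s) = x : x : s
   dup []    = error "dup: empty stack"
   pop (_:s) = s
   pop []    = error "pop: empty stack"

   -- The Joy program "3 dup pop" as a composite:
   program :: Stack -> Stack
   program = pop . dup . push 3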

Could not one also define a language with assignments and call it "functional", interpreting expressions as pure functions operating on an environment?

I mean, every imperative language can be implemented with the State monad, and that doesn't make it functional (I think).
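
For instance, here is a hedged sketch using Control.Monad.State (the toy counter program is mine):

   import Control.Monad.State

   -- An "imperative" counter program read as a pure function on its
   -- environment; the environment here is a single Int.
   incr :: State Int ()
   incr = modify (+ 1)

   program :: State Int Int
   program = do
     incr        -- "x := x + 1"
     incr        -- "x := x + 1"
     get         -- "return x"

   -- runState program 0 == (2, 2): pure in, pure out.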

Perhaps my question makes no sense, and is just wordplay. I'd really like to know what you think.


Answering with a question

I think that if we interpret C's statements as pure State functions, it would indeed make C a pure language. Generalizing, perhaps the set of imperative languages is a subset of pure languages. This flips the question around: instead of asking whether Joy has what it takes to be considered functional, I ask: does Joy have what it takes to be considered imperative? What is the criterion for being imperative?

Joy is functional

Joy is not purely functional; its analogue of a monad is a word that pushes some of its arguments back onto the stack after it is through with them (e.g. 'read', which accepts an i/o channel on the stack and returns the same i/o channel, modified under the covers, together with the value read). However, there is no state monad outside the stack itself, which can be understood as a compound value in the "concatenation is composition" interpretation.
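
The shape of 'read' under that interpretation, as a hedged Haskell sketch (the Chan stand-in is mine, not Joy's actual channel type):

   -- A purely functional stand-in for the channel: reading yields the
   -- value together with the "same" channel, advanced past what was read.
   newtype Chan = Chan String

   readWord :: Chan -> (Chan, Char)
   readWord (Chan (c:rest)) = (Chan rest, c)  -- "modified, under the covers"
   readWord (Chan [])       = error "readWord: channel exhausted"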

"The ability to move things around"

I got into a discussion about this earlier in the context of a language called "Cat" (a Joy-inspired language): http://lambda-the-ultimate.org/node/2761#comment-41180.

Normally, purity gives you easy-to-follow dataflow, which allows you to reason about small chunks of code in isolation. This makes code easier to understand and modify. Twisting the definition of your language to implicitly thread through all kinds of state may allow you to call your language "pure", but you've lost the clear dataflow and local reasoning.

Maybe a more useful metric would be: "How precisely are a function's inputs and outputs specified?" A function in Joy that simply pushes a single value on the stack needs no inputs. But, semantically, it takes the entire stack as input, making the specification very imprecise. Similarly, a Haskell program that fakes state by threading a huge structure through all its functions is probably being less precise than it could be.
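
To illustrate the contrast, a hypothetical Haskell sketch ('World' and its fields are made up):

   -- The first signature admits everything the world record contains;
   -- the second says exactly what is consumed and produced.
   data World = World { counter :: Int, logs :: [String] }

   impreciseTick :: World -> World   -- touches... something in there
   impreciseTick w = w { counter = counter w + 1 }

   preciseTick :: Int -> Int         -- touches exactly the counter
   preciseTick n = n + 1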

(BTW, "functional" is an overloaded word. In the context of the "purely functional", it has to do with side-effects. In the context of "functional programming", I think it has more to do with functions as first-class values. So when I'm talking about side-effects, I try to use the word "pure" instead of "functional". Pure is an overloaded word too, but not so much in this specified field.)

Given a "suitable" concatenative language

Given a "suitable" concatenative language, it is trivial to construct a dataflow graph for any program. In order to be suitable, a type system is realistically going to be necessary. First order functions can then be restricted to having constant stack effects (e.g. Int Int -> Int or String Int -> Char). Higher order functions can be restricted in various ways to make analysis easier, but may require more information about their arguments in order to determine the stack effect.

In the language I'm working on, functions are divided into first order "functions" (with constant stack effects) and second order "functionals", the latter of which can only take functions as arguments. This is similar to Backus's FP. The language is designed such that, for any function (be it simply a normal first order function or a first order function resulting from the instantiation of a functional), the stack effect will be knowable.

It might seem that this is simply a deficiency of concatenative languages, but part of the problem results from the fact that concatenative languages allow combinators not easily expressible in other languages. For example, if the function given to 'map' is allowed to access the entire stack, then essentially you have a generic version of Haskell's 'mapAccum' that can work with any number of values (instead of having to construct and deconstruct tuples). Of course, with a type system, 'map' can be appropriately restricted such that the function cannot access more than the top element of the stack. In such a case, it is comparable to Haskell's map, and rules like '[A] map [B] map == [A B] map' (map fusion) hold as expected.
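
The Haskell counterpart of that fusion law, for comparison:

   -- map g . map f == map (g . f), for pure f and g.
   unfused, fused :: [Int] -> [Int]
   unfused = map (* 2) . map (+ 1)
   fused   = map ((* 2) . (+ 1))
   -- Both yield [4,6,8] on [1,2,3].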

Finally, to address your issue of data being "hard to move around", I can say that this is something you quickly get very good at. That said, it is possible to extend a concatenative language to allow the naming of values by enabling the "lifting" of objects to the function level. For example, in my language the 'swap' function can be expressed as 'λ x y. y x', or alternatively, as a top-level definition with 'x y swap = y x'.

Restricting functionals

I puzzled over the first two paragraphs above this morning, and I've puzzled over them again now.

Is there a particular problem you've identified with the Hindley-Milner-style row typing that Cat uses? If there is, could you explain it a bit more concretely? If not, then what advantage does your apparently more restrictive type system have?

Justification

I'm going to try and keep this as brief as possible. If you find it too brief, please don't hesitate to ask for further clarification.

Before I can talk about why I'm proposing a second order alternative to Cat's higher order system, I first need to extend Cat's type system. The reason is that Cat's system, as it stands, is not usable.

Let's say, for example, that we want a function in Cat called 'make_hash'. Given a function that compares two elements for equality, it returns a hash table. Unfortunately, there is currently no way to give such a function a type in Cat. This is the best we can do (where uppercase words are row variables, lowercase words are scalar variables, and title-case words are base types):

   make_hash : A [B c c -> B Bool] -> A (Hash c d)

(Actually, we cannot even do this as Cat does not offer parameterized types, but that deficiency is not of interest here.)

Why is this type insufficient? The problem is '[B c c -> B Bool]'. We want to pass a function that consumes two values of the same type and yields a boolean, such as '[Int Int -> Bool]' or '[String String -> Bool]'. However, the type given will also allow us to pass '[Int String String -> Int Bool]'! Essentially, because *all* functions in Cat are row polymorphic, there is no way to enforce that a given function consumes *no more* than a certain number of arguments.
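
For contrast, in Haskell the comparison function's arity is pinned down by its type, so nothing extra can sneak in (a sketch; 'Hash' is a hypothetical stand-in for a real hash-table type):

   data Hash k v = Hash   -- placeholder; construction elided

   -- The equality test must consume exactly two keys and yield a Bool.
   makeHash :: (k -> k -> Bool) -> Hash k v
   makeHash _eq = Hash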

(Important point: Cat allows row variables to be omitted in type signatures in certain situations. This is just a shorthand for the actual type with row variables.)

The way to solve this problem is to allow functions to not be row polymorphic. Here is the actual type we want:

   make_hash : A [0 b b -> 0 Bool] -> A (Hash b c)

Here, '0' represents the bottom of the stack. A '0' can only be unified with another '0' or with a row variable (e.g. 'A'). It cannot be unified with a vector of a row variable and one or more scalars (e.g. 'A Int').
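
That rule can be stated as a toy predicate (purely illustrative Haskell; the encoding is mine):

   -- '0' may unify with another '0' or with a bare row variable, but
   -- never with a row that already carries scalars (like 'A Int').
   data Scalar = SVar String | TInt | TBool
   data Row    = Bottom | RowVar String | Row :> Scalar

   bottomUnifiesWith :: Row -> Bool
   bottomUnifiesWith Bottom     = True    -- '0' ~ '0'
   bottomUnifiesWith (RowVar _) = True    -- '0' ~ 'A'
   bottomUnifiesWith (_ :> _)   = False   -- '0' ~ 'A Int' is rejected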

This is a straightforward change to the type system, and one that Cat will have to make if it is to ever be usable. For the remainder of this post, I'll refer to this hypothetical language as "Cat0".

Alright, so now that we have Cat0, I can demonstrate some of its deficiencies that I'm trying to address in my language named "5th".

The first problem is that certain functions in Cat0 have weird types inferred for them. As a simple example, take the function 'twice' that simply calls a function on top of the stack twice. We'd write this in Cat0 as such:

   define twice { dup dip apply }

Essentially, 'twice' duplicates the function, calls it on the stack below the original copy of the function, and then calls the original function. In Cat0 (and Cat), the following type is inferred:

   twice : A [A -> A] -> A

Now this type actually is not so weird. It allows you to do things like '[dup *] twice' without issue. However, you are unable to provide a function to 'twice' which does not have the type '[A -> A]'. For example, you may want to do '[pop] twice', but the type system will stop you.
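
Haskell's usual 'twice' has exactly this shape, and the same restriction:

   twice :: (a -> a) -> a -> a
   twice f = f . f
   -- twice (* 2) 3 == 12 is fine, but any argument whose input and
   -- output types differ (the analogue of '[pop] twice') is rejected.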

Here's the same function in 5th (functionals take functions as arguments in square brackets):

   twice[F] = F F

The following type is inferred:

   twice[A -> B /\ B -> C] : A -> C

Here, we have an intersection type. This type is much less restrictive as 'twice[F]' will always receive the same type as 'F F' for any 'F'. In other words, if 'foo foo' type checks, then so will 'twice[foo]'. This restores the compositional nature of concatenative programs lost in Cat's type system. (This is a point worth further addressing, including the fact that composing 'a b' with 'c' is not the same as composing 'a' with 'b c' in Cat. I can explain this more if you're interested.)

Using intersection types may seem like a dangerous thing to do, but because functionals in 5th are restricted to only manipulating functions, the type system stays decidable and the types stay small.

I'll give one more example, this one more unfortunate than the last. Here is the function 'bi@' in Cat that, given a function, calls it on the top two elements of the stack (e.g. '3 4 [sq] bi@' == '9 16'):

   define bi@ { dup [ dip ] dip apply }

This gets the following type in Cat0:

   bi@ : A b b [A b -> A] -> A

Now this is a problem. All we can do with a function of this type are things like '[pop] bi@'. Our original example, namely '[sq] bi@', will be rejected by the type system.

In Cat0 however (not Cat), we can give a type signature to restrict the type as such:

   bi@ : A b b [0 b -> 0 c] -> A c c

This is a more useful type. It lets us do things like '[sq] bi@' as we originally intended. However, we can no longer do '[pop] bi@'. Hence we have a problem as neither type signature is more general than the other. Accordingly, you'd need both versions (and actually several more).
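
For comparison, the restricted signature corresponds to Haskell's apply-to-both-halves-of-a-pair (a sketch; the pair stands in for the top two stack cells):

   both :: (a -> b) -> (a, a) -> (b, b)
   both f (x, y) = (f x, f y)
   -- both (^ 2) (3, 4) == (9, 16), matching '3 4 [sq] bi@'; but, as
   -- with the second Cat0 signature, the '[pop] bi@' use is lost.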

Here's the same function in 5th (where '_F' is shorthand for 'dip[F]'):

   bi@[F] = _F F

It receives the following type:

   bi@[A b -> C /\ C d -> E] : A b d -> E

This type, again, is much more flexible. Not only can we do 'bi@[pop]', but we can also do 'bi@[sq]', 'bi@[dup]', 'bi@[swap]', et cetera.

Another issue I wanted to tackle is that of effect types. The notation necessary for effect types can be significantly simpler in 5th than it could be in Cat0 (where the higher order functions would force a bit more type noise). Specifically, type signatures that capture the full effect information of a function or functional can be given in 5th without the use of type variables. Unfortunately, I'm running out of steam at this point, so I'll lazily ask that you take my word for it.

Alright, so that's the gist of it. There are several other reasons I am pursuing this approach aside from the type issues, including implementation issues (it is easy to eliminate the need for a stack in 5th -- not so in Cat), ease of reasoning, ease of programming (shuffling functions around on the stack can be tedious -- compare the definitions of 'bi@' above), et cetera. I should also note that first class functions are indeed possible in 5th. Only first class functionals are disallowed.

Finite-rank intersection types

OK, I see where you are coming from. I don't yet see that the problems with typing 'twice' mean we need a whole new type system, but I'm open to the idea that intersection typing might be a better fit for typed concatenative languages than HM-style.

Maybe restricted intersection typing could allow you to combine decidability with a more interesting class of higher-order functions. Are you aware of finite-rank intersection types, e.g. Trevor Jim's "Rank 2 Type Systems and Recursive Definitions"?

Yes I am aware of Jim's

Yes, I am aware of Jim's work. You are right that a more flexible system than what I'm proposing could be employed. I've done a bit of work applying Daan Leijen's HML to concatenative languages, and I think that is likely a better approach for a higher order concatenative language than intersection types. The cases where you really need higher order types or intersection types are rather rare anyway.

My desire to go with the function/functional distinction is as much due to non-type-related concerns as it is type-related concerns. In Backus's FP, for instance, there was no type system at all, yet an advantage was still seen in going with a limited set of combining forms. 5th allows definition of new combining forms (unlike FP), but due to the way functionals are handled, such new forms can be translated to existing forms simply via textual substitution. For example, 'map[F] = _null fold[swap _F cons]' is defined strictly in terms of existing forms, and 'map[foo]' is equivalent to '_null fold[swap _foo cons]'.
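
For comparison, the same shape in Haskell, where map is likewise definable as a fold:

   -- The 5th definition above parallels the standard fold-based map:
   mapViaFold :: (a -> b) -> [a] -> [b]
   mapViaFold f = foldr (\x acc -> f x : acc) []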

Compilation is also a big issue. For example, 'square' has the type 'Int -> Int' and 'swap' has the type 'a b -> b a', but they can be unified to give the type 'Int Int -> Int Int'. As such, you can't look at a type at compile time and know how many arguments to pass to a function. Accordingly, you still need some sort of stack or some kind of function tagging to know how many variables to pass to a function at runtime. 5th's approach avoids this issue by ensuring the exact function passed to a functional is always statically knowable.

Finally, I'm also interested in the function/functional approach because of how it can be mapped to a dataflow representation. This gets much more difficult (or impossible) to do in a way that makes visual sense in the presence of unrestricted higher order functions.

Properties of type systems

Compilation is also a big issue. For example, 'square' has the type 'Int -> Int' and 'swap' has the type 'a b -> b a', but they can be unified to give the type 'Int Int -> Int Int'. As such, you can't look at a type at compile time and know how many arguments to pass to a function. Accordingly, you still need some sort of stack or some kind of function tagging to know how many variables to pass to a function at runtime.

This sounds like principal typing, or some appropriate proxy for it, will be an important property for your type system to have. Trevor Jim talks about that in the article I linked to. It might be rewarding to take a closer look at it.

Understanding stack manipulations

Normally, purity gives you easy-to-follow dataflow, which allows you to reason about small chunks of code in isolation. This makes code easier to understand and modify. Twisting the definition of your language to implicitly thread through all kinds of state may allow you to call your language "pure", but you've lost the clear dataflow and local reasoning.

I agree with this. Cat's type system, it seems to me, makes the claim of purity easier to defend for Cat than for Joy by this criterion, since what is happening to the stack is statically determinable.

Joy is mostly functional

Joy is functional in the sense that all data is persistent. Even the stack itself is persistent; the current stack can be placed on the stack with the 'stack' function, and conversely, a stack on top of the stack can be made to be the "main" stack with the 'unstack' function. This functionality is used quite often. For example, the 'ifte' function saves the stack before calling the conditional and restores it before calling the appropriate branch, thereby hiding any changes to the stack that the conditional may have made.
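
A hedged sketch of that 'ifte' behaviour, with the stack as an explicit persistent value (the Int-only stack and flag encoding are my simplifications; Joy's stack is heterogeneous):

   type Stack = [Int]   -- truth values encoded as Ints, for illustration

   ifte :: (Stack -> Stack)   -- condition: may shuffle the stack freely
        -> (Stack -> Stack)   -- then-branch
        -> (Stack -> Stack)   -- else-branch
        -> Stack -> Stack
   ifte test thenB elseB s =
     case test s of                        -- run the test on the stack...
       (flag : _) | flag /= 0 -> thenB s   -- ...discard its result stack,
       _                      -> elseB s   -- the branch sees the original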

One reason Joy would *not* be functional is that there is no way to separate I/O from the rest of the program, since Joy has no type system. The other reason it might not be functional is that any function can access the entire stack (and therefore the entire state of the computation). Again, a type system would help here. I'm currently working on a language with a type and effect system to address both of these issues.