Can local variable assignments be considered pure functional without monads?

Okay, this might seem like an odd question, but is there contention over whether a while loop that performs assignment on local function variables is pure functional? For example, consider the following Java code:

int process(int[] xs) 
{
  int zeros = 0;
  int sum = 0;
  int i=0;
  while (i < xs.length) {
    if (xs[i] == 0)
      zeros++;
    sum += xs[i];
    ++i;
  }
  return sum - zeros;
}

Now I have frequently heard that the above code is considered to have side effects because of the assignments to local variables. However, the mapping to pure-functional stack-based code (here shown in Cat) is very straightforward:

define process {
  // top argument = xs
  0 // zeros
  0 // sum
  0 // i
  [
    dig3 // bring xs to top of local stack
    dupd swap get_at eqz // xs[i] == 0
      [[inc] dip3] // inc zeros
      []
    if
    dupd swap get_at // tmp = xs[i]
    [bury3] dip // put xs to bottom of local stack
    swap [add_int] dip // sum += tmp
    inc // ++i
  ]
  [
    dig3 count [bury3] dip // tmp = xs.Size
    dupd lt_int // i < tmp
  ]
  while
  pop // remove i
  swap sub_int // sum-zeros
}

So, my problem is that I am not sure whether this is old news or new news. Any feedback would be appreciated.

It's old news

Just like your stack language represents state with a stack, reference types can be translated to a "pure functional" context by passing a bag of address -> value mappings around (and returning them from functions that "update state" -- à la monads).

That doesn't really buy you much. It's still very awkward to work with, and one of the most useful properties of referential transparency (the ability to apply algebraic reasoning to your programs) is lost for most programs.
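For the record, here is a minimal Haskell sketch of that translation (the Store type and helper names are made up for illustration): each "assignment" becomes an ordinary pure function that takes the address -> value map and hands back a new one.

import qualified Data.Map as Map

type Store = Map.Map Int Int    -- the "bag" of address -> value mappings

-- "x := v" is just a pure function from store to store
writeVar :: Int -> Int -> Store -> ((), Store)
writeVar addr val store = ((), Map.insert addr val store)

-- reading a variable threads the store through unchanged
readVar :: Int -> Store -> (Int, Store)
readVar addr store = (Map.findWithDefault 0 addr store, store)

-- "x := x + 1", with the store passed along explicitly
incrVar :: Int -> Store -> ((), Store)
incrVar addr store0 =
  let (v, store1) = readVar addr store0
  in  writeVar addr (v + 1) store1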

the most useful properties

the most useful properties of referential transparency (the ability to apply algebraic reasoning to your programs) is lost for most programs

This isn't actually the case. We have complete referential transparency for every sub-expression in the stack-based code I showed. It's just a question of understanding what the types are.

I think you misread my comment

I didn't say that your program isn't "referentially transparent."

So where is the loss of referential transparency?

You refer to a loss of referential transparency due to the technique I used. Please elaborate.

What I said was ...

... "one of the most useful properties of referential transparency (the ability to apply algebraic reasoning to your programs) is lost for most programs"

In other words, obviously you can simulate a Turing machine in lambda calculus (Church-Turing!), but that doesn't mean that the lambda calculus encoding of a Turing machine is the best place to prove that e.g. a function is distributive w.r.t. another (a*(b+c) = a*b + a*c).

Old news, basically...

There was some discussion about this in a recent long and monstrous thread. Start here. Anton's immediate reply is good, but then it goes quickly off the rails.

Anyway, yes, it's possible to consider a function like that purely functional, when viewed from the outside. (Of course, the implementation is not purely functional.) This is exactly what Haskell's State monad lets us do: we can write a pure function that internally introduces and uses a local mutable state.
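For concreteness, a rough sketch of the function from the original post written against mtl's State monad might look like this:

import Control.Monad.State

-- the Java loop, with (zeros, total, i) threaded as "local mutable state";
-- viewed from the outside, process is still an ordinary pure function
process :: [Int] -> Int
process xs = evalState loop (0, 0, 0)
  where
    loop = do
      (zeros, total, i) <- get
      if i < length xs
        then do
          let x = xs !! i
          put ( if x == 0 then zeros + 1 else zeros
              , total + x
              , i + 1 )
          loop
        else return (total - zeros)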

The implementation is purely functional

Of course, the implementation is not purely functional.

Actually it is. Every instruction is a pure tail polymorphic function with a well-defined type. The implementation I showed is a composition of a sequence of functions.

Confusion

Sorry, I was referring to the original Java code, or to the Haskell version using State. I would not make any claims about the Cat code. I haven't spent much time with stack languages, and it's not clear to me what the right notion of RT is in that case.

I will venture a guess... The final result of your composition is a function from stack to stack, right? The meaningful notion of RT needs to somehow characterize the relationship between the input and output stacks. I know you've done work in this area, but I'm not familiar with it. How would you capture that some functions are RT and some are not?

Labeling effects.

Sorry about the confusion as well.

I will venture a guess... The final result of your composition is a function from stack to stack, right?

Yes, every function is a function from stack to stack. You could equivalently define Cat primitives as functions from list to list in Haskell.

The meaningful notion of RT needs to somehow characterize the relationship between the input and output stacks. I know you've done work in this area, but I'm not familiar with it. How would you capture that some functions are RT and some are not?

I take a very simple approach: any function (including the implicit composition implied by whitespace) that takes an impure function as an argument is itself automatically promoted to an impure function. Otherwise it remains pure. In Cat I represent impure functions using "~>" and pure functions using "->".

Some examples of the types of functions in the original examples (making row variables explicit)

add_int : ('R int int -> 'R int)
swap : ('R 'a 'b -> 'R 'b 'a)
dupd : ('R 'a 'b -> 'R 'a 'a 'b)
[swap] : ('R -> 'R ('S 'a 'b -> 'S 'b 'a)) 
dip : ('R 'a ('R -> 'S) -> 'S 'a)
while : ('R ('R -> 'R) ('R -> 'R bool) -> 'R)

An impure primitive looks like:

writeln : ('R any ~> 'R)

This means that "define f { 5 writeln }" is impure because writeln is impure and the composition of "5" and "writeln" is impure. So if the "while" function recieved an impure argument, it would be promoted to an impure function. In the example I gave, every every primitive is pure: it only transforms the stack in a predictable and reproducible manner.

[Note: throughout I still use the term "pure" to mean "referentially transparent without regard to termination".]

Ok

Without understanding all the details of your system, I do think I see the big picture. In the absence of any notion of a store other than the stack, there are no side effects involving mutable state. Any store "effects" that escape the function will be encoded in the type of the function. So I think we understand each other, and yes, in a scheme like this, I would say it is perfectly reasonable to call functions using local variables "pure".

I'd probably plump for "not

I'd probably plump for "not news". You don't need address-value pairs or monads, just encode assignment as argument passing:

process :: [Int] -> Int
process = loop 0 0 0
    where
        loop zeros sum i xs =
            if i < length xs then
                loop
                    (if xs !! i == 0 then zeros + 1 else zeros)
                    (sum + xs !! i)
                    (i + 1)
                    xs
            else
                sum - zeros
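A quick unfolding on an arbitrary sample input shows the "assignment as argument passing" explicitly:

-- process [0, 1, 2, 0, 3]
--   = loop 0 0 0 [0,1,2,0,3]
--   = loop 1 0 1 [0,1,2,0,3]   -- xs !! 0 == 0, so zeros is "updated"
--   = loop 1 1 2 [0,1,2,0,3]
--   = loop 1 3 3 [0,1,2,0,3]
--   = loop 2 3 4 [0,1,2,0,3]   -- xs !! 3 == 0
--   = loop 2 6 5 [0,1,2,0,3]
--   = 6 - 2
--   = 4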

There's a paper on this called...Lambda the...hmm.

Still need monads to analyze

Although it perhaps opens up a more esoteric debate, general recursion is impure, and some argue it ought to be restricted to a monad as well (e.g. the Epigram papers).

General recursion is impure

"general recursion is impure"

I had never heard that before. Where can I find evidence of this? Is there a particular paper that demonstrates this, or is there a classic example you can show?

See here for a brief

See here for a brief discussion. Here are some slides discussing the issue from that workshop. Basically, non-termination/partiality is an effect. General recursion can lead to non-termination, so it too may be impure.
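One common way to make that explicit (a sketch of the general idea only, not anything taken from the Epigram papers) is to thread a fuel argument through the recursion, so that possible non-termination becomes visible in the result type:

-- general recursion with an explicit fuel budget: partiality shows up as
-- Nothing, and every recursive call decreases the fuel, so a totality
-- checker could accept this definition
collatzSteps :: Int -> Integer -> Maybe Int
collatzSteps 0    _ = Nothing        -- out of fuel: "might not terminate"
collatzSteps _    1 = Just 0
collatzSteps fuel n
  | even n    = succ <$> collatzSteps (fuel - 1) (n `div` 2)
  | otherwise = succ <$> collatzSteps (fuel - 1) (3 * n + 1)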

Thanks a lot for the links.

Thanks a lot for the links. That sounds like a contentious claim; is there wide consensus in the academic community nowadays? I believe I can find numerous examples in the literature that do not acknowledge non-termination as an effect. This leads me to believe that the precise definition of 'effects' is somewhat situation-dependent (e.g. do I care about termination of the code or not).

It is well-known that

It is well known that general recursion can result in non-termination, and that a potentially non-terminating function is a partial function. See the thread on Total Functional Programming (the paper is quite good too). As to whether treating partiality as an effect is generally accepted, I can't say. It's certainly an interesting idea though.

While is goto with assignment == let

Yes, this example shows that the code is fundamentally referentially transparent. So once this code gets compiled to assembly, assuming tail-call optimization in the compiler, we will end up with something remarkably similar to the stack-based code I showed.

I guess then the only thing I can point to as being possibly noteworthy is the observation that we don't have to generate recursive functions with named arguments in order to reason about the code functionally; we can go directly to a point-free functional stack-based representation and reason about it formally from there.

[sidebar: I think it worth noting for those newcomers reading along that the paper to which Roly referred is Lambda the Ultimate Imperative]

Yes, they can.

It's obvious (to me) that any function where assignment is limited to local variables is referentially transparent (it was me that started this topic in the "other" thread). Since the function does not update any external variables, it can be replaced with its results, and therefore it is referentially transparent.

Of course the supporters of pure functional programming will now jump at me, claiming I'm all wrong, and that input variables are not considered output as well, yada yada yada :-)

What about the implicit parameters?

In one sense, Cat functions implicitly take the entire stack as an input parameter and yield a new stack as a return value. This threading of the stack through every function call gives you less freedom to "move things around" (one of the benefits of referential transparency).

In Java, you could take the stance that every operation is a pure function of its inputs, but that the entire state of the world is threaded through each operation. But now you have even less freedom to move things around.

In your example Java code, since only local variables are manipulated, you could say that instead of the entire state of the world being threaded through, only the function's current activation record is. This means that no implicit state needs to be threaded through calls to that function, just like in Haskell.

And by "move things around", I'm referring a human's ability to rearrange code without having to think too hard -- something Haskell is good at. I feel like it's not as easy for a human to do the same in Cat, though Cat's type system does capture information about the stack, which gives the compiler a decent amount of information with which to perform optimizations.

In some sense, this makes Cat better than Java because if you wrote a Java program that used a java.util.Stack for passing parameters, the compiler would have a hard time figuring out what was going on. On the other hand, you don't do that in Java; you use normal parameter passing, which the compiler can track just fine (this is why I'm not really a fan of stack languages). But I imagine there are cases where Cat's stack type system allows you to perform something more directly/efficiently than you could with normal parameter passing.

This threading of the stack

This threading of the stack through every function call gives you less freedom to "move things around" ... I'm referring to a human's ability to rearrange code without having to think too hard

I disagree quite vigorously: stack-based higher-order languages make it very easy to move code around (though I do agree Haskell is good at it as well). I'd point you to the MetaCat rewriting system, and one of Manfred von Thun's essays about Joy, for some discussion. An extended set of sample MetaCat rewriting rules can be found at: http://cat-language.googlecode.com/svn/trunk/cat/metacat-macros.cat.

In some sense, this makes Cat better than Java because if you wrote a Java program that used a java.util.Stack for passing parameters, the compiler would have a hard time figuring out what was going on. On the other hand, you don't do that in Java; you use normal parameter passing, which the compiler can track just fine (this is why I'm not really a fan of stack languages). But I imagine there are cases where Cat's stack type system allows you to perform something more directly/efficiently than you could with normal parameter passing.

I'd like to point out here that the so-called "normal parameter passing" in Java is in fact done on a stack in the virtual machine. Personally I am unable to see a semantic difference between the named parameter approach of Java and the unnamed approach of a stack language, except that in the Java virtual machine you can't return multiple results, or construct lambda expressions on the stack.

An example

I disagree quite vigorously: stack-based higher-order languages make it very easy to move code around (though I do agree Haskell is good at it as well). [...]

I've never really used stack languages, so I have no intuition about programming in them, but it seems like the implicit stack makes things a little harder on the human. For example:

0 1 2 3 4 run

If I wanted to move the invocation of run somewhere, I'd have to know how many arguments run takes (say, 3) and then make sure I move those arguments as well. In Haskell, the invocation would be a grammatically-delimited expression, which is trivial to visually identify and extract. Here's a slightly more complicated example:

0 1 2 jump
  3 4 run

If I wanted to move the invocation of run down, it seems tricky. The Haskell code for this might look like:

(a, b, c) = jump 1 2
(d, e)    = run  c 3 4

It seems easier for a human to identify code rearrangement possibilities in the Haskell code.

But like I said, I've never really used a stack language. Maybe these aren't good examples. Maybe I just haven't yet gotten used to keeping track of the stack in my head. Also, the stack code is less verbose. Maybe there's a way to sacrifice some conciseness in order to make it as easy-to-rearrange as Haskell?

I'd like to point out here that the so-called "normal parameter passing" in Java is in fact done on a stack in the virtual machine. Personally I am unable to see a semantic difference between the named parameter approach of Java and the unnamed approach of a stack language, except that in the Java virtual machine you can't return multiple results, or construct lambda expressions on the stack.

I'm sure Cat's stack types can handle everything Java's parameter passing can. What I was trying to say is that even though the Java language doesn't have any real stack typing system, it's not a horrible situation because parameter passing covers the common cases. The same goes for Haskell, which doesn't have a special way to type stacks either.

But what I was actually more interested in is what makes something purely functional. (Just to be explicit about this, I agree that process is purely functional from the external perspective. It's the implementation of process that I'm talking about here.)

If we took Java and said "every operation implicitly takes the state of the world as an input and returns a new state of the world as an output", would that make Java purely functional? That doesn't sound right, which is why I tried thinking about the question in terms of implicit state and ease-of-rearranging, but maybe these aren't good measures either.

Moving Expressions

If I wanted to move the invocation of run somewhere, I'd have to know how many arguments run takes and then make sure I move those arguments as well.

First, we know that given the same stack configuration (values and types), "run" will always return the same value if it is pure (i.e. its output depends only on its input). As you hint, that alone is not particularly useful. However, we do know that "4 run" can be moved anywhere as a single unit (since it is the composition of two pure functions) and retain its referential transparency (RT). We also know that "3 4 run" etc. are also RT. This is all because the composition of RT functions is RT.

Haskell, the invocation would be a grammatically-delimited expression, which is trivial to visually identify and extract.

It is not necessarily grammatically delimited, because functions are curried. So given examples like:

  f g 1 2 3

We are not sure whether g takes one, two, or three arguments (well, technically we know it takes one, but the same rationale can be applied to Cat code). Generally speaking, though, we don't really care. In Haskell "f 1" is referentially transparent, and in Cat "1 f" is referentially transparent (if and only if "f" is a pure function, since Cat supports impure functions).

It seems easier for a human to identify code rearrangement possibilities in the Haskell code

In some cases it can be. However, looking at a code snippet out of context, with free variables, it is not so cut and dried:

  ... f 1 ...

Can I move the "f 1" literally wherever I want? It depends on what the lexical environment is. While this is not generally a problem for humans, it does require extra work for automated refactoring.

But what I was actually more interested in is what makes something purely functional. ... If we took Java and said "every operation implicitly takes the state of the world as an input and returns a new state of the world as an output", would that make Java purely functional?

Not in any meaningful or useful way, because you wouldn't be able to move anything around and guarantee preservation of the semantics of the program, which is the whole advantage of RT. If the input is the "whole universe", then we will never be able to guarantee reproduction of it and thus never be able to move it. This also applies if the input has to be the "whole stack". Note that the Cat type-system allows us to always safely identify a finite portion of the top of the stack that is of interest at a key point, and assure that the rest of the stack is unmodified (and unexamined which is a key point).

Renaming and variable capture

Every destructive update of a variable can be modelled as an alpha-conversion (i.e. a renaming); with the caveat that mutations which happen in one half of a conditional have to then be accounted for in the other half.

I.e.

a = 5;
if (condition)
  a = 7;
print "a is ", a;

can be rewritten as

const a = 5;
const a2 = condition ? 7 : a;
print "a is ", a2, " and was ", a;

This is an old compiler trick.

The fly in the ointment is, of course, variable capture. A local variable which is not subject to capture is essentially a linear type--it has the property that only one thing (the current stack frame) refers to it, so destructive update can be performed as an optimization. A variable which is subject to (persistent) capture and held in a closure, OTOH, cannot be treated in this fashion.

ST and SSA

That locally imperative code can be used to implement a pure function is of course well known; a pointed example being Haskell's ST monad.
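For instance, a sketch along these lines, using STRefs, implements the original loop with genuine mutable references yet exposes a perfectly pure function via runST:

import Control.Monad (forM_, when)
import Control.Monad.ST
import Data.STRef

process :: [Int] -> Int
process xs = runST $ do
  zeros <- newSTRef (0 :: Int)   -- locally mutable counters
  total <- newSTRef (0 :: Int)
  forM_ xs $ \x -> do
    when (x == 0) $ modifySTRef zeros (+ 1)
    modifySTRef total (+ x)
  (-) <$> readSTRef total <*> readSTRef zeros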

That imperative programs can be converted to purely functional ones is also well known. The Lambda the Ultimate papers cover this, as one replier has mentioned, and, as Scott alludes to, many imperative compilers use a form called Static Single Assignment (SSA) that is essentially purely functional.

To answer your question, process as a whole is purely functional (for a sequential system at least), but its implementation is imperative. You have all the drawbacks and benefits of state inside process: things like the order of operations being important and obscured parallelism. The benefits of side effects don't really show up in small blocks of code, but then the costs are also greatly attenuated in small blocks of code.

Order unimportant

... process as a whole is purely functional (for a sequential system at least), but its implementation is imperative. You have all the drawbacks and benefits of state inside process: things like the order of operations being important and obscured parallelism.

The order of operations is not important: all that is important is that the outcome (and the types) are respected. For example:

dig3 count [bury3] dip

can be easily replaced with:

[[[count] dip swap] dip swap] dip

The point I am trying to make is that every single sub-expression in the Cat code is purely functional. Every instruction is a tail-polymorphic function mapping from a stack to a stack (only modifying and examining the top of it).

When you place two instructions side-by-side in Cat the result is the same as composing the two functions, and if both are pure, then the result is pure. This continues on and on to sequences of three and more instructions. No matter how you slice the code, it is the composition of a sequence of pure functions.

For example consider the types of "dig3 dupd"

dig3 : ('R 'a 'b 'c 'd -> 'R 'a 'b 'c 'd 'a)
dupd : ('R 'a 'b -> 'R 'a 'a 'b) 
dig3 dupd : ('R 'a 'b 'c 'd -> 'R 'a 'b 'c 'd 'd 'a)

In the code "dig3" takes any stack with at least four items on the top, and rewrites it to a new stack, without affecting the resting of the stack below the top four items. The rest is hopefully evident.

I'm talking about the Java.

I'm talking about the Java. The Cat is purely functional, unless I'm misguessing the semantics, and as such does benefit from that.

Oh, whoops

Sorry for misunderstanding. If you want to know the semantics of Cat, they are summarized in this Scheme evaluator:

(define (cat-eval stk exp)
  (if (null? exp)
      stk
      (cat-eval
       (case (car exp)
	 ((true)   (cons #t stk))
	 ((false)  (cons #f stk))	 
	 ((apply)  (cat-eval (cdr stk) (car stk)))
	 ((papply) (cons (cons (cadr stk) (car stk)) (cddr stk)))
	 ((pop)    (cdr stk))
	 ((dup)    (cons (car stk) stk))
	 ((swap)   (cons (cadr stk) (cons (car stk) (cddr stk))))
	 ((dip)    (cons (cadr stk) (cat-eval (cddr stk) (car stk))))
	 ((add)    (cons (+ (cadr stk) (car stk)) (cddr stk)))
	 ((eq)     (cons (= (cadr stk) (car stk)) (cddr stk)))
	 ((empty)  (cons (null? (car stk)) stk))
         ((compose)(cons (append (car stk) (cadr stk)) (cddr stk)))
	 ((cons)   (cons (list (cadr stk) (car stk)) (cddr stk)))
	 ((uncons) (cat-eval (cdr stk) (car stk)))
	 ((if)     (if (caddr stk) 
		       (cat-eval (cdddr stk) (cadr stk)) 
		       (cat-eval (cdddr stk) (car stk))))
         ;; Treat everything else as a literal
	 (else     (cons (car exp) stk)))
       (cdr exp))))

Most everything else can be derived or extrapolated from those primitives.

Rewriting "process" using fold

Since we were talking about referential transparency and rewriting of stack languages, I kind of wanted to throw the following equivalent program at people:

define process { 
  0 swap 0 [[[inc] dip2] ifeqz +] fold swap sub_int 
}

What [I think] is interesting here is that fold in Cat accepts a row-polymorphic function as argument, meaning that it can accept any function that takes N + 1 arguments and returns N results (where N >= 1). In other words the type of fold is

  fold : ('A list 'b ('A 'b 'c -> 'A 'b) -> 'A 'b)

So this fold has the same effect as local assignment in a pure manner. The compilation of this fold function as an assembly code loop should be fairly obvious for compiler writers.

I am curious about how to go about doing the same thing in Haskell. I know that we can do it by representing local variables as a list. Is there a better method?

folding with tuples

So this fold has the same effect as local assignment in a pure manner.

That's a basic feature of folds.

I am curious about how to go about doing the same thing in Haskell. I know that we can do it by representing local variables as a list. Is there a better method?

To handle N arguments and results in a language like Haskell, a tuple is typically used. For more on this, see the Universality of fold tutorial starting at section 4, "Increasing the power of fold: generating tuples".

Here's a Haskell version of the function, using a fold with tuples:

process xs = sum - zeros
  where (sum, zeros) = foldl op (0,0) xs
        op (sum, zeros) 0 = (sum, zeros+1)
        op (sum, zeros) x = (sum+x, zeros)

Here's a one-liner version for completeness, although it stops short of attempting to eliminate all meaningful local names:

process = uncurry (-) . (foldl (\(sum, zeros) x -> (sum+x,zeros+(if x == 0 then 1 else 0))) (0, 0))

Also

There's also "foldM", which might be relevant, depending on what exactly you mean.

Thanks for the example

Thanks for the example Anton.

I think this is cool because it suggests a correspondence between stack-based functional languages and pure applicative functional languages. Essentially a stack-based language just passes the stack as a tuple from function to function, and composes all the functions. The only difference AFAICT is that Cat functions are all row-polymorphic, making it easy to compose functions that represent the different imperative statements.

You might want to take a

You might want to take a look at the way the Arrow sugar builds an 'environment' in Haskell - a generalisation in terms of something a little more pleasant than products could be rather fun.
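For anyone unfamiliar with the notation, here is a small sketch of what that looks like over the plain function arrow: the names bound in the proc block act as the "environment", and the desugaring threads them around as tuples behind the scenes.

{-# LANGUAGE Arrows #-}
import Control.Arrow

-- the fold step from earlier, written with arrow notation
step :: ((Int, Int), Int) -> (Int, Int)
step = proc ((total, zeros), x) -> do
  total' <- arr (uncurry (+))                            -< (total, x)
  zeros' <- arr (\(z, v) -> if v == 0 then z + 1 else z) -< (zeros, x)
  returnA -< (total', zeros')

-- e.g. foldl (curry step) (0, 0) [0, 1, 2, 0, 3]  ==  (6, 2)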

Not so fast

Essentially a stack-based language just passes the stack as a tuple from function to function

While it's possible to handwave enough to make this true, if you were to actually express that correspondence as a semantics, taking types into account, you'd find some devils in the details.

A particular issue is what an individual function knows about the types of the arguments it's expecting vs. the type of the rest of the stack.

I notice that the Cat Scheme evaluator posted in this thread ignores such issues, because its stack is untyped. It also doesn't distinguish between the types of terms and values.

Writing an ML or Haskell version of that evaluator would help to highlight these issues. (You could also do it in Scheme, but you'd have to supply the type discipline yourself. Doing it in a typed language will preclude unintended cheating.)

Here you go

Well rather than writing a full evaluator in Haskell, here is at least an embedding:

lit' :: a -> t -> (t, a)
lit' n s = (s, n)
pop' :: (t, a) -> t
pop' (s, x) = s
dup' :: (t, a) -> ((t, a), a)
dup' (s, x) = ((s, x), x)
swap' :: ((t, a), b) -> ((t, b), a)
swap' ((s, x), y) = ((s, y), x)
add' :: (Num u) => ((t, u), u) -> (t, u)
add' ((s, x), y) = (s, x + y)
apply' :: (t, t -> u) -> u
apply' (s, f) = f s
quote' :: (t, a) -> (t, forall u . u -> (u, a))
quote' (s, x) = (s, \s -> (s, x))
compose' :: ((t, (u -> v)), (v -> w)) -> (t, (u -> w))
compose' ((s, f), g) = (s, g . f)
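Setting quote' aside (its type needs features beyond vanilla Haskell), the first-order combinators compose without fuss; for example:

-- the Cat program "21 dup add", applied to an empty initial stack ()
example :: ((), Int)
example = (add' . dup' . lit' 21) ()   -- ==> ((), 42)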

There is a technique described in Okasaki 02 for embedding postfix languages in Haskell which can be used.

I'll have to look into writing an evaluator, but I suck at Haskell. Perhaps some kind soul will jump in.

A particular issue is what an individual function knows about the types of the arguments it's expecting vs. the type of the rest of the stack.

I believe the rest of the stack can be expressed using a type variable with inner forall polymorphism (i.e. a row variable in each function), if that is what you are getting at.

Kevin Millikin's Cat Evaluator in ML

I forgot that Kevin has already contributed the outline of an evaluator in ML for Cat.

Back to unityped lists

In Kevin's implementation, a stack is a list of universally-typed values, not a tuple.

Converting such code to implement the stack as a (typed) tuple would, I believe, require an excursion deep into the land of fancy type extensions, particularly if you implement a reasonably complete core with quote etc. (Happy to be proved wrong.) This is really the point I'm making. I think it raises some questions about the correspondence you suggested.

I'm not sure that Okasaki's approach changes the situation that much, but I haven't looked at it in detail yet. The "practical concerns" mentioned in the conclusion, and use of GHC type extensions, are indicative of the sorts of issues I was thinking of.

By contrast, an evaluator like Kevin's can be written without any use of higher-order functions or polymorphic types, i.e. it's very simple. I think that writing such an evaluator yourself (in SML or Ocaml is fine, no need for Haskell's chrome fins) would be very instructive for you.

Re your Haskell embedding, I don't see how to use 'quote', offhand, without getting type errors. Can you give a working sample program for that embedding which quotes a useful function, such as e.g. [dup add], and then uses it? (The Okasaki paper seems to use a more complex approach to procedural abstraction than your embedding does.)

Inner Forall Polymorphism and Equirecursive Types

In Kevin's implementation, a stack is a list of universally-typed values, not a tuple.

Good point. However, I am pretty sure you can't implement Cat directly in ML in a type-safe way unless you implement your own typing discipline.

Converting such code to implement the stack as a (typed) tuple would, I believe, require an excursion deep into the land of fancy type extensions. This is really the point I'm making. I think it raises some questions about the correspondence you suggested.

I remain fairly convinced, based on a fair amount of study on the subject, that at least inner forall polymorphism and equirecursive types are necessary to implement a Cat/Joy-like language in a type-safe manner.

Anyway my point was simply that there is a path, although perhaps not well travelled, from assignment statements in imperative languages to statically typed functional languages, and that what you were doing in your Haskell example was a step down that path.

I think that writing such an evaluator yourself (in SML or Ocaml is fine, no need for Haskell's chrome fins) would be very instructive for you.

AFAICT all it will demonstrate is the limitations of standard typing disciplines in these languages, which I am already aware of. Besides I think that implementing a type inference algorithm that supports inference of inner forall polymorphism and equirecursive types is more useful.

I don't see how to use 'quote', offhand, without getting type errors.

I don't really think it is possible. I have tried but ran into problems. It is kind of pointless for me, a non-Haskell expert, to even continue ramming my head against those problems.

I appreciate the discussion because it helps give me a clearer picture of what is well-known and not so well-known in the area of converting imperative code to functional code. It also inspired me to get the type checking and equirecursive types working again in the Cat compiler.

Cross-purposes?

Anyway my point was simply that there is a path, although perhaps not well travelled, from assignment statements in imperative languages to statically typed functional languages, and that what you were doing in your Haskell example was a step down that path.

What I was questioning was your characterization of the "correspondence between stack-based functional languages and pure applicative functional languages" with the stack as a (typed) tuple in the latter.

Regarding assignment statements, there's a well-known and widely exploited correspondence between assignment statements in imperative languages and the use of tuples in functional languages, as demonstrated in my example. That correspondence is why functional programmers are interested in tail recursion - because if you're using tuples to simulate destructive update, you're typically doing tail recursion; and if you don't optimize that, iteration becomes unnecessarily bounded. (Of course the previous statements apply to single values too, but tuples provide a way of dealing with more than one value at a time.) Folds are a fundamental expression of this approach.

To tie up the loose end here, I agree that if there weren't any typing issues with treating the stack as a typed tuple (by which I mean something other than a nested tuple of universally-typed values), then your correspondence would work and the above approach to assignment would apply.

AFAICT all it will demonstrate is the limitations of standard typing disciplines in these languages, which I am already aware of. Besides I think that implementing a type inference algorithm that supports inference of inner forall polymorphism and equirecursive types is more useful.

Part of the reason I suggested that exercise is that I noticed the typing shortcuts in your Scheme evaluator, e.g. no distinction between terms and values. The standard typed expression of such an evaluator doesn't push the limits of standard type systems at all, but it is still very useful for highlighting fundamental typing distinctions (not specific to the "limitations of standard typing disciplines") that can easily go unnoticed in an untyped evaluator.

It would also help in communicating about semantics, since these evaluator implementations are essentially standard semantic definitions. Even if you want to restate your claim about limitations to apply to standard semantic modeling techniques, you should at least be reasonably familiar with the techniques you're dismissing.

I don't really think it is possible. I have tried but ran into problems. It is kind of pointless for me, a non-Haskell expert, to even continue ramming my head against those problems.

I agree. That's why I questioned the correspondence you raised. :)