Clojure: Trampoline for mutual recursion

Lots of buzz surrounding Clojure these days. The main knock on Clojure has been the lack of tail call optimization. Because of the current lack of TCO on the JVM, Clojure currently doesn't support automated TCO - though the plan is that if the JVM ever adds the capability, Clojure will implement it at that point. The recur syntax is a way to mimic TCO for self-recursion. Now the trampoline for mutual recursion is available for more involved forms of recursion:

I've added trampoline to ease the conversion/creation of mutually
recursive algorithms in Clojure.

Trampolines are a well known technique for implementing TCO - instead
of actually making a call in the tail, a function returns the function
to be called, and a controlling loop calls it, thus allowing chained
execution without stack growth.

As an aside, I have been tinkering with the language and it seems well thought out. For a newbie like myself, the main difference maker is the promotion of common sequences (lists, vectors, maps, sets) to built-in first class collections making it comparable in ease of use to other scripting languages.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

As a further aside

I translated the first chapter of SICP in Clojure - fairly trivial given the closeness of Scheme and Clojure. I do need to fix the recursion in several of the places to make it TCO (mostly using recur). (Which is one of the reasons the above announcement caught my attention).

Support?

Real instruction sets (or C) don't "support" proper tail recursion either. There's, what, 20 years worth of literature on how to solve this problem (and countless implementations thereof)?

That there is "buzz" around a non-TCO LISP dialect in 2008 is utterly incomprehensible to me. I'm all for a modern/practical LISP, but come on...

Is there something specific about the JVM that makes it impossible, or is it just a naive implementation?

Real instruction sets

Native instruction sets often let you do whatever you want with the stack. C doesn't in the ANSI standard, but you can do it with a bit of assembly. .NET IL has an explicit instruction for tail calls. The JVM, on the other hand, is very strict about how you use its stack and has no tail call instruction.

There are full TCO implementations in a couple of Scheme's for the JVM. Kawa uses trampolining and SISC uses a heap based custom stack. Either solution has performance implications as well as implications regarding Java interoperability. Java interop is a major design goal for Clojure.

Hickey's choice to make the programmer indicate when the trade-off is desirable is a pragmatic workaround. There's been some talk in the Scala mailing list about offering trampolining in the standard library - I wouldn't be surprised to see it in 2.8.0.

In the meantime we can all keep hoping for the day we see a tail call instruction on the JVM.

CL

Common Lisp has had a lot of buzz in recent years. Tail-call optimization is usually not available by default in CL. Programmers use control-flow constructs like LOOP, DOLIST, MAP, etc instead of direct recursion.

TCO is dramatically more important in Scheme than in most other languages.

Personally I'd prefer not to have automatic tail-call optimization performed in most of the languages that I use.

Interoperability

Ah, I didn't consider Java interop (trampolining is what I was getting at, you can do that most anywhere).

It seems you could work around the interop issues with sufficiently clever wrapping, but the performance hit of that /and/ trampolining would be a bit much. I suppose you can have JInterop or TCO, but not both.

Personally I'd still pick TCO :)

Yeah...

Personally, I would have picked TCO, too. And then, I would have regretted it. Fortunately, Rich Hickey does not make such mistakes.

JVM will eventually get TCO capability at which time adding it to Clojure will be piece of cake. Loosing a great deal of Java interoperability to have fully-blown TCO immediately is simply not the right choice. Especially since there *are* workarounds (recur and trampolines).

I personally consider

I personally consider trampolining a hack to get around a deficiency in the implementation. Having programmed in high-level languages for the last almost 20yrs which have had TCO I find that my code often just assumes this. Having recur or trampolining just takes care of the more explicit looping.

I understand the reason for not having TCO but from my point of view it severely limits the language.

Explicit TCO?

Hi Robert!

Out of curiosity - What do you think about having an explicit tailcall keyword in Erlang? I mean:

foreach(F, [])    -> [];
foreach(F, [H|T]) -> F(H), tailcall foreach(F, T).

This would help in a couple of scenarios that I've wrestled with and seen other people wrestle with, especially newbies. Firstly the compiler can check that you really are making a tailcall -- i.e. that tailcall is in tail-position. Secondly it prevents the compiler from dropping stack frames that may be inconsequential to performance but useful to see in backtraces for debugging i.e. the very many function calls that "just happen to be" tail calls.

The latter example is very important to me, and it always bugs me that TCO fans are dismissive of it. I've worked on systems where smart non-Erlang-hacker technicians had lots of unnecessary trouble with relating debug information (crash reports) with the source code due to unnecessarily missing frames in backtraces. I had the opportunity at Lisp50 to grill Will Clinger for the Schemers line on this and found it very unsatisfying: essentially that a Sufficiently Smart Compiler could selectively eliminate stack frames that probably aren't interesting in backtraces.

Explicit tailcall statement

This is the approach just adopted in Tcl 8.6: an explicit [tailcall] command that does the trampoline (using Tcl's new non-recursive evaluation engine, courtesy of Miguel Sofer). IIRC, it doesn't check if the call is actually in tail-position, but instead acts more like [return]: any tailcall command will be in tail-position, because the scope will exit when it is called regardless. It's not the most elegant solution, but it does the job. If you want implicit tailcalls then you can always write your own [proc] replacement that inserts the calls for you (not trivial for Tcl, but just about feasible, I think).

The problem with explicit TCO

Hi Luke,

The problem with a tailcall keyword is that it would do exactly what you are trying to avoid. You *want* to drop all stack frames, even saving only some of them will mean that in the long-run you will overflow the stack.

Think functions which never return.

Not a problem

A tail-call keyword can be pure annotation - a request for the compiler/interpreter to check that the following call appears in tail position and if so generate the appropriate structures. I'm a bit iffy on whether I like the idea myself, but it certainly doesn't need to cause problems.

An alternative:
Compiler automatically does tail calls where it can except when user explicitly puts in a "no-tail-call" annotation.

User can add an explicit "tail-call" annotation in which case the compiler issues an error if the call is not in tail position.

It's invalid to have both annotations on the same call.

Backtraces

Here's an example of a tail-call that I don't want to drop, in foo/1:

foo(N) -> bar(N), baz(N).
bar(N) -> ... .
baz(N) -> 1/N.

If I call foo(0) then I want a correct backtrace:

Divide-by-zero in: bar/1 - foo/1 - ... - shell

but if TCO were performed then for no significant runtime benefit I'd see:

Divide-by-zero in: bar/1 - ... - shell

which isn't true and may confuse me about which code-path my program has taken.

I as a programmer know which tail-calls I'd like to have optimized and which I wouldn't and I'd probably be quite happy to tell the compiler which is which using a tailcall keyword.

You may not agree but I hope you see what I mean now :-) I want to see complete backtraces.

"Complete" backtraces...

"Complete" backtraces are even beyond what you receive today. When an exception is called, you typically don't have information indicating every method call ultimately leading to it. You only have the current call stack, which is really part of the current continuation... i.e. where the call is going next rather than where it has been before.

If you want complete backtraces, or even a time-limited trace history, it might be worth implementing it in the debugger almost independently of the current continuation. Even in today's languages it would often be nice to know the last hundred or so calls/return values/etc. leading up to an exception.

In any case, I don't believe you should allow a goal for 'history' information limit the implementation on how your program moves from present to future states. People who keep histories do it by writing down stuff happening in the present. A debugger can be tooled to do the same.

Chicken Scheme does exactly that

Chicken Scheme's traceback report is simply the last N procedures called. Chicken does "trampolining in bulk", aka "Cheney on the MTA"; the code is transformed by CPS, and the CPS procedures, translated to C, simply call each other until the stack is full. Then live data is evicted to the heap, the stack is reset with longjmp(), and on we go.

So what is at C level a backtrace represents at Scheme level the last N procedures called, though N may be truncated if by bad luck the last stack reset has happened recently.

You and everybody else

I personally consider trampolining a hack

In personal communications between myself and Rich Hickey and in mailing list posts by Martin Odersky, both have said that TCO is their #1 first desired improvement in the JVM. Believe me, both are pained by the issue.

Having recur or trampolining just takes care of the more explicit looping.

Actually trampolining takes care of all tail calls. The inelegant part of the hack in Clojure (and perhaps soon in Scala) is in making the programmer specify when he/she wants it. Trampolining itself is a perfectly acceptable implementation technique, and I bet you can find a few Scheme-to-C compilers that use it.

Me and everybody else

I never assumed that Rich Hickey *liked* the situation with no TCO, and recur and trampoline are ways of trying to get around the lack of it in the underlying machine.

Yes, you can use trampoline to handle all tail calls, but it is not always obvious what you are doing and it requires adding something somewhere else to handle it. Sometimes I will call the last function and sometimes I will return it and I have to know why it is done in a certain way. Or more important the poor reader of my code will have to find out.

I have 2 related questions:

- Why is it called *tail* call and not *last* call? I learnt of it as last call, tail call to me implies recursion.

- Why has not CL adopted it in its spec?

Some confusion

Some confusion: trampolining is an implementation technique. You need not ever see it. Clojure makes it optional and visible, but it can be invisible and automatic as well.

Tail call just means a call in tail position. It need not be recursive. Why not call it "last"? I have no idea.

It's traditional ...

... but the Prolog community, which apparently came up with the idea independently, does say "last-call optimization" or "LCO".