Compiling tail-calls to simple C?

I'm writing a Lisp-to-C compiler, trying to keep compiler and runtime very simple and close to C, because I want to embed the language into C applications, and use C libraries.

My closures are structs with a C function pointer and an environment.

Ideally, I'd like the language to be properly tail-recursive like Scheme; however, without resorting to techniques like trampolines, Cheney on the M.T.A., or putting the whole program into one function, it seems that the only possible thing is what BitC does:

Within any function f, calls to f that appear in tail position w.r.t. the body of f must be tail recursive.

[...]

Within a letrec, calls to any function bound in the letrec that appear in tail position within some function bound by the letrec must be tail recursive.

These requirements apply only to function calls whose destination can be statically resolved by the compiler at compile time.

Does anyone have experience with the new tail call optimization of GCC, especially when making an indirect call via a function pointer?

Comment viewing options

when is a tail position call to self not tail recursive?

Within any function f, calls to f that appear in tail position w.r.t. the body of f must be tail recursive.

I don't understand the distinction being made here. I thought that a tail position call to f within f was the definition of tail recursion.

[edit: Nevermind, they are using "tail recursive" to mean "tail call optimized".]

continuations and user level threads

You can't support continuations, coroutines or user level (i.e. non OS) threads with this restriction. You'd still need one of the other schemes for that.

I doubt gcc can/will tail optimize calls via function pointer. I have verified that it optimizes named sibling functions as advertised.

Trampolines

Instead of compiling to direct code, you have a main loop that will call functions, receive return values, and call the next function.

Whatever call stack you have has to be managed by your main loop, not by C itself. I'm afraid there is no easier way.

A web search for "trampoline lisp" should give you further details.

Use llvm-gcc perhaps?

I have not verified this, but instead of using gcc, use llvm-gcc (the gcc front-end merged with the llvm backend). The llvm IR supports tail calls, but I don't know if makes it through the gcc front-end.

no -foptimize-sibling-calls for pointer indirected calls

though undocumented, this is easily tested with a simple test program:

int (*fp)(int);

int test(int x)
{
if( x == 0 )
return 0;
else
return fp(x-1);
}

int main()
{
fp = test;
fp(1000000);
}


consider to compile to a different backend that natively supports tail call optimization. c-- comes to mind.

noreturn calls

An alternative mechanism is a trick used by the chicken scheme compiler.

Basically, you have to CPS transform your program and compile all your calls as noreturn calls (with __attribute__ ((noreturn)) in gcc). There is an entry function that kick starts the computation and keeps a jumpbuf for restarting at the top of the stack. The computation keeps growing the stack and when the stack is exhausted you jump back to the top with the continuation of the computation.

The downside is that your C calling conventions must use continuation passing style.

CPS in gcc

Is there a reason gcc couldn't reuse the current stack frame for calls to functions with __attribute__ ((noreturn)) if -foptimize-sibling-calls is also specified?

If tail call is a goto...

Maybe I am missing something, but since a tail call is a glorified goto, why not translate it into one?

int fac(int n, int a)
{
if (n < 2)
return a;
else
return fac(n - 1, n * a);
}


Into this...

int fac(int n, int a)
{
__tailcall:
if (n < 2)
return a;
else {
int const __arg_n = n - 1;
int const __arg_a = n * a;
n = __arg_n;
a = __arg_a;
goto __tailcall;
}
}


I do not understand why this is not popular instead of the more sophisticated and less portable options.

Edit: Just realised my stupidity. This will not work when the tail call is to a different function.

Nemerle does this

This strategy is actually used by Nemerle for nested function definitions, since the .NET CLR's .tailcall instruction is actually slower than a regular function call.