My programming language compiles to C, I want to implement tail recursion optimization. The question here is how to pass control to another function without "returning" from the current function.
It is quite easy if the control is passed to the same function:
void f() {
__begin:
do something here...
goto __begin; // "call" itself
}
As you can see there is no return value and no parameters, those are passed in a separate stack adressed by a global variable.
Another option is to use inline assembly:
#ifdef __clang__
#define tail_call(func_name) asm("jmp " func_name " + 8");
#else
#define tail_call(func_name) asm("jmp " func_name " + 4");
#endif
void f() {
__begin:
do something here...
tail_call(f); // "call" itself
}
This is similar to goto
but as goto passes control to the first statement in a function, skipping the "entry code" generated by a compiler, jmp
is different, it's argument is a function pointer, and you need to add 4 or 8 bytes to skip the entry code.
The both above will work but only if the callee and the caller use the same amount of stack for local variables which is allocated by the entry code of the callee.
I was thinking to do leave
manually with inline assembly, then replace the return address on the stack, then do a legal
function call like f()
. But my attempts all crashed. You need to modify BP
and SP
somehow.
So again, how to implement this for x64? (Again, assuming functions have no arguments and return void
). Portable way without inline assembly is better, but assembly is accepted. Maybe longjump
can be used?
Maybe you can even push the callee address on the stack, replacing the original return address and just ret
?
Do not try to do this yourself. A good C compiler can perform tail-call elimination in many cases and will do so. In contrast, a hack using inline assembly has a good chance of going wrong in a way that is difficult to debug.
For example, see this snippet on godbolt.org. To duplicate it here:
The C code I used was:
int foo(int n, int o)
{
if (n == 0) return o;
puts("***\n");
return foo(n - 1, o + 1);
}
This compiles to:
.LC0:
.string "***\n"
foo:
test edi, edi
je .L4
push r12
mov r12d, edi
push rbp
mov ebp, esi
push rbx
mov ebx, edi
.L3:
mov edi, OFFSET FLAT:.LC0
call puts
sub ebx, 1
jne .L3
lea eax, [r12+rbp]
pop rbx
pop rbp
pop r12
ret
.L4:
mov eax, esi
ret
Notice that the tail call has been eliminated. The only call
is to puts
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With