Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ Tail recursion using 64-bit variables

I have written a simple Fibonacci function as an exercise in C++ (using Visual Studio) to test Tail Recursion and to see how it works.

this is the code:

int fib_tail(int n, int res, int next) {
  if (n == 0) {
    return res;
  }
  return fib_tail(n - 1, next, res + next);
}

int main()
{
  fib_tail(10,0,1); //Tail Recursion works
}

when I compiled using Release mode I saw the optimized assembly using the JMP instruction in spite of a call. So my conclusion was: tail recursion works. See image below:

Tail Recursion

I wanted to do some performance tests by increasing the input variable n in my Fibonacci function. I then opted to change the variable type, used in the function, from int to unsigned long long. Then I passed a big number like: 10e+08

This is now the new function:

typedef  unsigned long long ULONG64;

ULONG64 fib_tail(ULONG64 n, ULONG64 res, ULONG64 next) {
   if (n == 0) {
     return res;
   }
   return fib_tail(n - 1, next, res + next);
}

int main()
{
  fib_tail(10e+9,0,1); //Tail recursion does not work
}

When I ran the code above I got a stack overflow exception, which made me think that tail recursion was not working. I looked at the assembly and in fact I found this:

Tail Recursion doesn't work

As you see now there is a call instruction whereas I was expecting only a simple JMP. I don't understand the reason why using a 8 bytes variable disables tail recursion. Why the compiler doesn't perform an optimization in such case?

like image 899
codingadventures Avatar asked Mar 07 '15 20:03

codingadventures


1 Answers

This is one of those questions that you'd have to ask the guys that do compiler optimisation for MS - there is really no technical reason why ANY return type should prevent tail-recursion from being a jump as such - there may be OTHER reasons such as "the code is too complex to understand" or some such.

clang 3.7 as of a couple of weeks back clearly figures it out:

_Z8fib_tailyyy:                         # @_Z8fib_tailyyy
    pushl   %ebp
    pushl   %ebx
    pushl   %edi
    pushl   %esi
    pushl   %eax
    movl    36(%esp), %ecx
    movl    32(%esp), %esi
    movl    28(%esp), %edi
    movl    24(%esp), %ebx
    movl    %ebx, %eax
    orl %edi, %eax
    je  .LBB0_1
    movl    44(%esp), %ebp
    movl    40(%esp), %eax
    movl    %eax, (%esp)            # 4-byte Spill
.LBB0_3:                                # %if.end
    movl    %ebp, %edx
    movl    (%esp), %eax            # 4-byte Reload
    addl    $-1, %ebx
    adcl    $-1, %edi
    addl    %eax, %esi
    adcl    %edx, %ecx
    movl    %ebx, %ebp
    orl %edi, %ebp
    movl    %esi, (%esp)            # 4-byte Spill
    movl    %ecx, %ebp
    movl    %eax, %esi
    movl    %edx, %ecx
    jne .LBB0_3
    jmp .LBB0_4
.LBB0_1:
    movl    %esi, %eax
    movl    %ecx, %edx
.LBB0_4:                                # %return
    addl    $4, %esp
    popl    %esi
    popl    %edi
    popl    %ebx
    popl    %ebp
    retl


main:                                   # @main
    subl    $28, %esp
    movl    $0, 20(%esp)
    movl    $1, 16(%esp)
    movl    $0, 12(%esp)
    movl    $0, 8(%esp)
    movl    $2, 4(%esp)
    movl    $1410065408, (%esp)     # imm = 0x540BE400
    calll   _Z8fib_tailyyy
    movl    %edx, f+4
    movl    %eax, f
    xorl    %eax, %eax
    addl    $28, %esp
    retl

Same applies to gcc 4.9.2 if you give it -O2 (but not in -O1 which was all clang needed)

(And of course also in 64-bit mode)

like image 77
Mats Petersson Avatar answered Sep 24 '22 15:09

Mats Petersson