Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can a const expr be evaluated so fast

I have been trying out const expressions which are evaluated at compile time. But I played with an example that seems incredibly fast when executed at compile time.

#include<iostream> 

constexpr long int fib(int n) { 
    return (n <= 1)? n : fib(n-1) + fib(n-2); 
} 

int main () {  
    long int res = fib(45); 
    std::cout << res; 
    return 0; 
} 

When I run this code it takes about 7 seconds to run. So far so good. But when I change long int res = fib(45) to const long int res = fib(45) it takes not even a second. To my understanding it is evaluated at compile time. But the compilation takes about 0.3 seconds

How can the compiler evaluate this so quickly, but at runtime it takes so much more time? I'm using gcc 5.4.0.

like image 368
Peter234 Avatar asked Jan 23 '20 07:01

Peter234


2 Answers

The compiler caches smaller values and does not need to recompute as much as the runtime version does.
(The optimiser is very good and generates lots of code including trickery with special cases that are incomprehensible to me; the naive 2^45 recursions would take hours.)

If you also store previous values:

int cache[100] = {1, 1};

long int fib(int n) {
    int res = cache[n];
    return res ? res : (cache[n] = fib(n-1) + fib(n-2));
} 

the runtime version is much faster than the compiler.

like image 69
molbdnilo Avatar answered Oct 19 '22 20:10

molbdnilo


You may find interesting with 5.4 the function is not completely eliminated, you need at least 6.1 for that.

I do not think there is any caching happening. I am convinced the optimizer is smart enough to prove relationship between fib(n - 2) and fib(n-1) and avoids the second call completely. This is GCC 5.4 output (obtained from godbolt) with no constexpr and -O2:

fib(long):
        cmp     rdi, 1
        push    r12
        mov     r12, rdi
        push    rbp
        push    rbx
        jle     .L4
        mov     rbx, rdi
        xor     ebp, ebp
.L3:
        lea     rdi, [rbx-1]
        sub     rbx, 2
        call    fib(long)
        add     rbp, rax
        cmp     rbx, 1
        jg      .L3
        and     r12d, 1
.L2:
        lea     rax, [r12+rbp]
        pop     rbx
        pop     rbp
        pop     r12
        ret
.L4:
        xor     ebp, ebp
        jmp     .L2

I have to admit I do not understand the output with -O3 - the code generated is surprisingly complex, with lots of memory accesses and pointer arithmetics and it is quite possible there is some caching (memoization) done with those settings.

like image 45
Suma Avatar answered Oct 19 '22 20:10

Suma