Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is compile-time execution significantly faster than runtime execution?

Contrary to what this question says, this piece of code is exhibiting some weird behaviour:

long long int fibonacci(int num) {
    if (num <= 2) return 1;
    return fibonacci(num - 1) + fibonacci(num - 2);
}

int main() {
    auto t1 = std::chrono::high_resolution_clock::now();
    long long int x = fibonacci(45);
    auto t2 = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> time(t2 - t1);
    std::cout << "Time taken: " << time.count() << "ms";
}

On my machine, this compiles in ~700ms with -O3 (GCC) and the output is:

Time taken: 2667.55ms

I rewrote the above code with constexpr as follows:

constexpr long long int fibonacci(int num) {
    if (num <= 2) return 1;
    return fibonacci(num - 1) + fibonacci(num - 2);
}

int main() {
    auto t1 = std::chrono::high_resolution_clock::now();
    constexpr long long int x = fibonacci(45);
    auto t2 = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> time(t2 - t1);
    std::cout << "Time taken: " << time.count() << "ms";
}

Which compiles in roughly the same time but the output is:

Time taken: 0ms

As it stands, evaluating fibonacci(45) at compile-time is much faster than evaluating it at runtime. To eliminate multi-core compiling as a reason (which definitely isn't), I re-ran the second block above with fibonacci(60). Again, the code compiles in the same amount of time but the output is:

Time taken: 29499.6ms

What causes this big performance gap?

like image 396
Wais Kamal Avatar asked Sep 25 '21 17:09

Wais Kamal


2 Answers

Basically, for that short piece of code, compile time does not matter.

Even, if you do compile time evaluation.

The main problem is here the utmost bad algorithm used here. Using 2 recursive calls which will then call again 2 recursive functions and so on and so on has really the worst case time complexity for such a simple algorithm.

That problem can and must be solved with an iterative approach.

Something like that:

unsigned long long fibonacci(size_t index) noexcept {
    unsigned long long f1{ 0ull }, f2{ 1ull }, f3{};
    while (index--) { f3 = f2 + f1; f1 = f2; f2 = f3; }
    return f2;
}

If you use this function, then, reagrdless of optimization or not, your main will run in below 1ms.

In you original main function, if you do not use the result of the calculation, so the variable "X", then optimizing compiler can eliminate the complete function call. It is not necessary to call that function. The result is not used.

Make the following experiment.

Add std::cout << x << '\n'; as the last statement in your main function. You will be surprised. With enabled optimization, the function will be called at the very end. And you time measurement measures nothing.

Now to the compiler time version. Also the compiler will internally used optimized code. And it will convert the nonesense recursive approach to an iterative approach. And to calculate that values will take less time than the compiler overhead functions.

So, that is the real reason.

And the Fibonacci numbers can always be compiled fully at compile time. There are only 93 possible results for an 64 bit result value.

See the following approach, which creates a compile time array of all Fibonacci numbers. And if you want the n'th number, it is just a lookup value.

This will work very fast in either non-optimized or optimized mode.

It is the fasted possible solution for that problem.

#include <iostream>
#include <utility>
#include <array>
#include <algorithm>
#include <iterator>

// All done during compile time -------------------------------------------------------------------
constexpr unsigned long long getFibonacciNumber(size_t index) noexcept {
    unsigned long long f1{ 0ull }, f2{ 1ull }, f3{};
    while (index--) { f3 = f2 + f1; f1 = f2; f2 = f3; }
    return f2;
}
// Some helper to create a constexpr std::array initilized by a generator function
template <typename Generator, size_t ... Indices>
constexpr auto generateArrayHelper(Generator generator, std::index_sequence<Indices...>) {
    return std::array<decltype(std::declval<Generator>()(size_t{})), sizeof...(Indices) > { generator(Indices)... };
}
template <size_t Size, typename Generator>
constexpr auto generateArray(Generator generator) {
    return  generateArrayHelper(generator, std::make_index_sequence<Size>());
}
constexpr size_t MaxIndexFor64BitFibonacci = 93;

// This is the definition of a std::array<unsigned long long, 93> with all Fibonacci numbers in it
constexpr auto Fib = generateArray<MaxIndexFor64BitFibonacci>(getFibonacciNumber);
// End of: All done during compile time -----------------------------------------------------------


// The only code executed during runtime
int main() {

    // Show all Fibonacci numbers up to border value
    for (const auto fib : Fib) std::cout << fib << '\n';
}
like image 200
Armin Montigny Avatar answered Oct 17 '22 17:10

Armin Montigny


I'm not sure why this question is even appears unless you didn't exactly read up how constexpr is described. You can't work by "guessing" what's going on in program written in C++ unless the behaviour is part of UB or platform-defined.

C++ standard doesn't describe memoization in any way, thus we have no reason to suspect that such complex caching procedure may be used. Languages which support memoization (or things called "enterring", "tombstoning", and similar) are usually the same that are running intermediate code on some virtual machine and de-facto do not provide mechanics for compile-time evaluation of an arbitrary function as such never can be done for unknown platform.

constexpr allows creation of "constants" which are initialized through code executed on compiler's side (with all possible platform-defined effects and all restrictions applied in related standard).

Without it as-if rule applies, which includes a freedom for compiler not to do anything about changing code. Compiler is also allowed to throw whole thing way if result is not observable and it is clear that there is no side-effects of function call, which is easier to check when constexpr is used, i.e. there are no side-effects by definition.

In constexpr version whole execution stack would be unwinded because you explicitly told compiler that it is possible. The limitations on constexpr functions imposed by standard guarantee that function would be fit for such evaluation.

There can be unexpected results when cross-compiling for platform which got behaviour different from host one: you may get constexpr values for compiler-side platform.

With recursive constexpr functions you risk to exceed limit of recursions which is implementation-defined. E.g. your case wouldn't compile with some builds of g++ at all as you have more than 32 millions of iterations there due non-optimal recursion. THe way recursion is done, there is no memoization is possible on compile time. When it is possible? If f(x2) is calling f(x1), after f(x1) was called, evaluation of f(x2) might use already known value of f(x1). In your case fibonacci(x) calls fibonacci(x-1) and fibonacci(x-2) for any given x greater than 2. Those two values weren't evaluated before. It could look so that fibonacci(x-1) may reuse or provide fibonacci(x-2)'s value, but depending on C++ standard's version it's either:

A.) order of execution is undefined;
B.) fibonacci(x-1) is a context different from context of fibonacci(x).

In second case no code is executed between calls to chrono aside of initialization of x with value calculated by compiler. If you used static constexpr long long int x for the local variable, then even initialization would be gone, it would be a hard-coded constant value somewhere in data or in code where it's used. static formally prolongs life of object till the end of process, while an automatic constexpr variable would be initialized each time the function was entered (irrelevant to fact that main() should be entered only once).

like image 2
Swift - Friday Pie Avatar answered Oct 17 '22 17:10

Swift - Friday Pie