Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimisation of division in gcc

Here's some code (full program follows later in the question):

template <typename T>
T fizzbuzz(T n) {
    T count(0);
    #if CONST
        const T div(3);
    #else
        T div(3);
    #endif
    for (T i(0); i <= n; ++i) {
        if (i % div == T(0)) count += i;
    }
    return count;
}

Now, if I call this template function with int, then I get a factor of 6 performance difference according to whether I define CONST or not:

$ gcc --version
gcc (GCC) 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)

$ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=0" &&
 time ./wrappedint
g++  -O3 -Wall -Werror -DWRAP=0 -DCONST=0   wrappedint.cpp   -o wrappedi
nt
484573652

real    0m2.543s
user    0m2.059s
sys     0m0.046s

$ make -B wrappedint CPPFLAGS="-O3 -Wall -Werror -DWRAP=0 -DCONST=1" &&
 time ./wrappedint
g++  -O3 -Wall -Werror -DWRAP=0 -DCONST=1   wrappedint.cpp   -o wrappedi
nt
484573652

real    0m0.655s
user    0m0.327s
sys     0m0.046s

Examining the disassembly shows that in the fast (const) case, the modulo has been turned into a multiplication and shift type thing, whereas in the slow (non-const) case it's using idivl.

Even worse, if I try to wrap my integer in a class, then the optimisation doesn't happen whether I use const or not. The code always uses idivl and runs slow:

#include <iostream>

struct WrappedInt {
    int v;
    explicit WrappedInt(const int &val) : v(val) {}
    bool operator<=(const WrappedInt &rhs) const { return v <= rhs.v; }
    bool operator==(const WrappedInt &rhs) const { return v == rhs.v; }
    WrappedInt &operator++() { ++v; return *this; }
    WrappedInt &operator+=(const WrappedInt &rhs) { v += rhs.v; return *this; }
    WrappedInt operator%(const WrappedInt &rhs) const 
        { return WrappedInt(v%rhs.v); }
};

std::ostream &operator<<(std::ostream &s, WrappedInt w) {
    return s << w.v;
}

template <typename T>
T fizzbuzz(T n) {
    T count(0);
    #if CONST
        const T div(3);
    #else
        T div(3);
    #endif
    for (T i(0); i <= n; ++i) {
        if (i % div == T(0)) count += i;
    }
    return count;
}

int main() {
    #if WRAP
        WrappedInt w(123456789);
        std::cout << fizzbuzz(w) << "\n";
    #else
        std::cout << fizzbuzz<int>(123456789) << "\n";
    #endif
}

My questions are:

1) Is there a simple principle of C++ itself, or gcc's optimisation, which explains why this happens, or is it just a case of "various heuristics run, this is the code you get"?

2) Is there any way to make the compiler realise that my locally-declared and never-referenced const WrappedInt can be treated as a compile-time const value? I want this thing to be a straight replacement for int in templates.

3) Is there a known way of wrapping an int such that the compiler can discard the wrapping when optimising? The goal is that WrappedInt will be a policy-based template. But if a "do-nothing" policy results in essentially arbitrary 6x speed penalties over int, I'm better off special-casing that situation and using int directly.

like image 884
Steve Jessop Avatar asked Jul 13 '09 20:07

Steve Jessop


People also ask

How do I use optimization in GCC?

The -O level option to gcc turns on compiler optimization, when the specified value of level has the following effects: 0. The default reduces compilation time and has the effect that debugging always yields the expected result. This level is equivalent to not specifying the -O option at all.

Do compilers optimize division by 2?

Yes, compilers generate the most optimal code for such simplistic calculations.

What is optimization GCC?

GCC has a range of optimization levels, plus individual options to enable or disable particular optimizations. The overall compiler optimization level is controlled by the command line option -On, where n is the required optimization level, as follows: -O0 . (default). No optimization is performed.

Is GCC an optimizing compiler?

GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to -O , this option increases both compilation time and the performance of the generated code.


1 Answers

I'm guessing its just the severely old GCC version you are running. The oldest compiler I have on my machine - gcc-4.1.2, performs the fast way with both the non-const and the wrap versions (and does so at only -O1).

like image 106
Greg Rogers Avatar answered Sep 24 '22 19:09

Greg Rogers