Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ operator overload performance issue

Consider following scheme. We have 3 files:

main.cpp:

int main() {   
    clock_t begin = clock();
    int a = 0;
    for (int i = 0; i < 1000000000; ++i) {
        a += i;
    }
    clock_t end = clock();
    printf("Number: %d, Elapsed time: %f\n",
            a, double(end - begin) / CLOCKS_PER_SEC);

    begin = clock();
    C b(0);
    for (int i = 0; i < 1000000000; ++i) {
        b += C(i);
    }
    end = clock();
    printf("Number: %d, Elapsed time: %f\n",
            a, double(end - begin) / CLOCKS_PER_SEC);
    return 0;
}

class.h:

#include <iostream>
struct C {
public:
    int m_number;
    C(int number);
    void operator+=(const C & rhs);
};

class.cpp

C::C(int number)
: m_number(number)
{
}
void 
C::operator+=(const C & rhs) {
    m_number += rhs.m_number;
}

Files are compiled using clang++ with flags -std=c++11 -O3.

What I expected were very similar performance results, since I thought that compiler will optimize the operators not to be called as functions. The reality though was a bit different, here is the result:

Number: -1243309312, Elapsed time: 0.000003
Number: -1243309312, Elapsed time: 5.375751

I played around a bit and found out, that if I paste all of the code from class.* into the main.cpp the speed dramatically improves and results are very similar.

Number: -1243309312, Elapsed time: 0.000003
Number: -1243309312, Elapsed time: 0.000003

Than I realized that this behavior is probably caused by the fact, that compilation of main.cpp and class.cpp is completely separated and therefore compiler is unable to perform adequate optimizations.

My question: Is there any way of keeping the 3-file scheme and still achieve the optimization level as if the files were merged into one and than compiled? I have read something about 'unity builds' but that seems like an overkill.

like image 731
Jendas Avatar asked Jul 24 '14 09:07

Jendas


People also ask

What are the limitations of operator overloading?

1) Only built-in operators can be overloaded. New operators can not be created. 2) Arity of the operators cannot be changed. 3) Precedence and associativity of the operators cannot be changed.

Can operators be overloaded in C?

C does not support operator overloading at all. You can only implement operations as functions: Colour colour_add(Colour c1, Colour c2); Colour colour_substract(Colour c1, Colour c2); ...

Which operator in C Cannot be overload?

(::) Scope resolution operator cannot be overloaded in C language.

Why is it necessary to overload an operator in C?

The need for operator overloading in C++It allows us to provide an intuitive interface to our class users, plus makes it possible for templates to work equally well with classes and built-in types. Operator overloading allows C++ operators to have user-defined meanings on user-defined types or classes.


2 Answers

Short answer

What you want is link time optimization. Try the answer from this question. I.e., try:

clang++ -O4 -emit-llvm main.cpp -c -o main.bc 
clang++ -O4 -emit-llvm class.cpp -c -o class.bc 
llvm-link main.bc class.bc -o all.bc
opt -std-compile-opts -std-link-opts -O3 all.bc -o optimized.bc
clang++ optimized.bc -o yourExecutable

You should see that your performance reaches the one that you had when pasting everything into main.cpp.

Long answer

The problem is that the compiler cannot inline your overloaded operator during linking, because it no longer has its definition in a form which it can use to inline it (it cannot inline bare machine code). Thus, the operator call in main.cpp will stay a real function call to the function declared in class.cpp. A function call is very expensive in comparison to a simple inlined addition which can be optimized further (e.g., vectorized).

When you enable link time optimization, the compiler is able to do this. As you see above, you first create llvm intermediate representation byte code (the .bc files, which I will simply call llvm code hereinafter) instead of machine code. You then link these files to a new .bc file which still contains llvm code instead of machine code. In contrast to machine code, the compiler is able to perform inlining on llvm code. opt is the llvm optimizer (be sure to install llvm), which performs the inlining and further link time optimizations. Then, we call clang++ a final time to generate executable machine code from the optimized llvm code.

For People with GCC

The answer above is only for clang. GCC (g++) users must use the -flto flag during compilation and during linking to enable link time optimization. It is simpler than with clang, simply add -flto everywhere:

      g++ -c -O2 -flto main.cpp
      g++ -c -O2 -flto class.cpp
      g++ -o myprog -flto -O2 main.o class.o
like image 177
gexicide Avatar answered Oct 14 '22 03:10

gexicide


The technique what you are looking for is called Link Time Optimization.

like image 32
erenon Avatar answered Oct 14 '22 01:10

erenon