Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the optimization level ( g++ ) you use while comparing two different algorithms written in C++ ?

I have two algorithms written in C++. As far as I know, it is conventional to compile with
-O0 -NDEBUG (g++) while comparing the performance of two algorithms(asymptotically they are same).
But I think the optimization level is unfair to one of them, because it uses STL in every case. The program which uses plain array outperforms the STL-heavy algorithm 5 times faster while compiled with -O0 options. But the performance difference is not much different when I compile them with -O2 -NDEBUG.

Is there any way to get the best out of STL (I am getting heavy performance hit in the vector [] operator) in optimization level -O0?

What optimization level (and possibly variables like -NDEBUG) do you use while comparing two algorithms?

It will be also great help if someone can give some idea about the trend in academic research about comparing the performance of algorithms written in C++?


Ok, To isolate the problem of optimization level, I am using one algorithm but two different implementation now.

I have changed one of the functions with raw pointers(int and boolean) to std::vector and std::vector... With -O0 -NDEBUG the performances are 5.46s(raw pointer) and 11.1s(std::vector). And with -O2 -NDEBUG , the performances are 2.02s(raw pointer) and 2.21s(std::vector). Same algorithm, one implementation is using 4/5 dynamic arrays of int and boolean. And the other one is using using std::vector and std::vector instead. They are same in every other case

You can see that in -O0 std::vector is outperformed with twice faster pointers. While in -O2 they are almost the same.

But I am really confused, because in academic fields, when they publish the results of algorithms in running time, they compile the programs with -O0.

Is there some compiler options I am missing?

like image 243
hasan Avatar asked Oct 03 '09 06:10

hasan


People also ask

What is optimization level in C?

The degree to which the compiler will optimize the code it generates is controlled by the -O flag. No optimization. In the absence of any version of the -O flag, the compiler generates straightforward code with no instruction reordering or other attempt at performance improvement. -O or -O2.

How many levels of optimization are there?

An optimization level is chosen with the command line option -O LEVEL , where LEVEL is a number from 0 to 3. The effects of the different optimization levels are described below: -O0 or no -O option (default)

What is optimization level for gcc compiler?

GCC has a range of optimization levels, plus individual options to enable or disable particular optimizations. The overall compiler optimization level is controlled by the command line option -On, where n is the required optimization level, as follows: -O0 . (default).

What is O3 in C?

-O3 : the highest level of optimization possible. It enables optimizations that are expensive in terms of compile time and memory usage. Compiling with -O3 is not a guaranteed way to improve performance, and in fact, in many cases, can slow down a system due to larger binaries and increased memory usage.


2 Answers

It depends on what you want to optimize for.

Speed

I suggest using -O2 -NDEBUG -ftree-vectorize, and if your code is designed to specifically run on x86 or x86_64, add -msse2. This will give you a broad idea on how it will perform with GIMPLE.

Size

I believe you should use -Os -fno-rtti -fno-exceptions -fomit-frame-pointer. This will minimize the size of the executable to a degree (assuming C++).


In both cases, algorithm's speed is not compiler dependent, but a compiler can drastically change the way the code behaves if it can "prove" it can.

GCC detects 'common' code such as hand-coded min() and max() and turns them into one SSE instruction (on x86/x86_64 and when -msse is set) or using cmov when i686 is available (SSE has higher priority). GCC will also take liberty in reordering loops, unrolling and inlining functions if it wants to, and even remove useless code.

As for your latest edit:

You can see that in -O0 std::vector is outperformed with twice faster pointers. While in -O2 they are almost the same.

That's because std::vector still has code that throws exceptions and may use rtti. Try comparing with -O2 -NDEBUG -ftree-vectorize -fno-rtti -fno-exceptions -fomit-frame-pointer, and you'll see that std::vector will be slightly better than your code. GCC knows what 'built-in' types are and how to exploit them in real world use and will gladly do so - just like it knows what memset() and memcpy() does and how to optimize accordingly when copy size is known.

like image 63
LiraNuna Avatar answered Sep 22 '22 07:09

LiraNuna


The compiler optimizations usually won't change the complexity order of an algorithm, just the constant and the linear scale factor. Compilers are fairly smart, but they're not that smart.

Are you going to be compiling your code for release with just -O0? Probably not. You might as well compare the performance of the algorithms when compiled with whatever compilation flags you actually intend to use.

like image 40
Boojum Avatar answered Sep 21 '22 07:09

Boojum