I'm distributing a C++ program with a makefile for the Unix version, and I'm wondering what compiler options I should use to get the fastest possible code (it falls into the category of programs that can use all the computing power they can get and still come back for more), given that I don't know in advance what hardware, operating system or gcc version the user will have, and I want above all else to make sure it at least works correctly on every major Unix-like operating system.
Thus far, I have g++ -O3 -Wno-write-strings. Are there any other options I should add? On Windows, the Microsoft compiler has options for things like fast calling convention and link-time code generation that are worth using; are there any equivalents on gcc?
(I'm assuming it will default to 64-bit on a 64-bit platform, please correct me if that's not the case.)
Without knowing any specifics of your program it's hard to say. -O3 covers most of the optimisations. The remaining options come at a cost. If you can tolerate small differences in rounding and your code doesn't depend on strict IEEE floating-point behaviour, you can try -Ofast, which adds -ffast-math on top of -O3. It disregards strict standards compliance and can give you faster code.
The remaining optimisation flags improve performance only for certain programs and can even be detrimental to others. Look at the available flags in the gcc documentation on optimisation flags and benchmark them.
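To make the rounding trade-off concrete, here is a minimal sketch of why reordering floating-point additions can change a result. The function names are made up for illustration. Under strict IEEE rules the two functions give different answers for the same three inputs. With -ffast-math (which -Ofast enables) the compiler is allowed to treat them as interchangeable, so code that relies on a particular evaluation order is no longer safe.

```cpp
// Adds three doubles strictly left to right: (a + b) + c.
double sum_left_to_right(double a, double b, double c) {
    double partial = a + b;
    return partial + c;
}

// Adds the same three doubles right to left: a + (b + c).
double sum_right_to_left(double a, double b, double c) {
    double partial = b + c;
    return a + partial;
}

// With a = 1e20, b = -1e20, c = 1.0 and strict IEEE double arithmetic:
//   left to right:  (1e20 - 1e20) + 1.0  gives 1.0
//   right to left:  1e20 + (-1e20 + 1.0) gives 0.0, because 1.0 is far
//   below the spacing between adjacent doubles near 1e20 and is lost.
```

If your program gives acceptable results with either ordering, -Ofast is worth benchmarking. If it uses compensated summation or otherwise depends on the exact order of operations, stay with -O3.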
Another option is to enable C99 (-std=c99) and inline appropriate functions. This is a bit of an art: you shouldn't inline everything, but with a little work you can get your code to be faster (albeit at the cost of having a larger executable).
If speed is really an issue I would suggest either going back to Microsoft's compiler, or to try Intel's. I've come to appreciate how slow some gcc compiled code can be, especially when it involves math.h.
EDIT: Oh wait, you said C++? Then disregard my C99 paragraph, you can inline already :)
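As a rough illustration of what an appropriate function to inline looks like in C++, here is a small sketch. Both function names are hypothetical. The helper is tiny and called in a tight loop, so the call overhead would be comparable to the work it does. Keep in mind that inline is only a hint, and GCC at -O3 already makes its own inlining decisions, so measure before and after.

```cpp
// Small, hot helper: a reasonable inlining candidate because it does only
// a few arithmetic operations per call.
inline long squared_distance(long x1, long y1, long x2, long y2) {
    const long dx = x2 - x1;
    const long dy = y2 - y1;
    return dx * dx + dy * dy;
}

// Hypothetical caller: sums the squared distance of each point from the
// origin. With the helper inlined, the loop body has no function call.
long sum_squared_distances_from_origin(const long* xs, const long* ys, int count) {
    long total = 0;
    for (int i = 0; i < count; ++i) {
        total += squared_distance(0, 0, xs[i], ys[i]);
    }
    return total;
}
```

Large functions, or functions called from many places, are usually poor candidates, since inlining them grows the executable without removing much overhead.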
I would try profile-guided optimization:
-fprofile-generate
Enable options usually used for instrumenting an application to produce a profile useful for later recompilation with profile-feedback-based optimization. You must use -fprofile-generate both when compiling and when linking your program. The following options are enabled: -fprofile-arcs, -fprofile-values, -fvpt.
After running the instrumented program on representative input, recompile with -fprofile-use so that GCC can use the collected profile.
You should also give the compiler hints about the architecture on which the program will run.
For example, if it will only run on a server and you can compile it on the same machine as the server, you can just use -march=native.
Otherwise you need to determine which features your users will all have and pass the corresponding parameter to GCC.
(Apparently you're targeting 64-bit, so GCC will probably already include more optimizations than for generic x86.)
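If you cannot pin down a common feature set in advance, one possible alternative is to build for a conservative baseline and check the CPU at run time. The sketch below assumes GCC or Clang on x86, where the __builtin_cpu_supports builtin is available. The function name detected_simd_level and the strings it returns are invented for this example, and on other compilers or architectures it simply reports "unknown".

```cpp
#include <string>

// Reports the widest of a few x86 SIMD levels that the running CPU
// supports. This is illustrative only: a real program would use the
// result to choose between differently compiled code paths.
std::string detected_simd_level() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // safe to call more than once
    if (__builtin_cpu_supports("avx2"))   return "avx2";
    if (__builtin_cpu_supports("sse4.2")) return "sse4.2";
    if (__builtin_cpu_supports("sse2"))   return "sse2";
    return "baseline";
#else
    // Assumption: no portable equivalent here, so report nothing specific.
    return "unknown";
#endif
}
```

The design choice is that the binary itself stays safe to run on any machine of the target architecture, while the hot code can still pick a faster path where the hardware allows it.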