I am working on an Intel CPU with the Nehalem/Westmere microarchitecture and want to optimize my code for it. Are there any specialized compilation flags or GCC-specific C functions that would improve my code's run-time performance?
I am already using -O3.
Language of the Code - C
Platform - Linux
GCC Version - 4.4.6 20110731 (Red Hat 4.4.6-3) (GCC)
My code contains some floating-point comparisons, and they are executed over a million times.
Please assume the code itself is already as well optimized as it can be.
First, if you really want to profit from optimization on newer processors like this one, you should install the newest version of the compiler. 4.4 came out some years ago, and even if it still seems to be maintained, I doubt that the newer optimization code is backported to it. (The current version is 4.7.)
GCC has a catch-all optimization flag that should usually produce code optimized for the machine you compile on: -march=native. Together with -O3, this should be all that you need.
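One way to check what -march=native actually turned on is to look at the SIMD feature macros GCC predefines for the build. The following is only a minimal sketch: the helper name report_simd_macros is made up for illustration, and it checks just the SSE family up to SSE4.2, which is what Nehalem supports.

```c
#include <stdio.h>

/* Hypothetical helper (not part of GCC): prints which SSE feature macros
   the compiler predefined for this build and returns how many were set. */
int report_simd_macros(void)
{
    int enabled = 0;
#ifdef __SSE2__
    puts("SSE2 enabled");
    enabled++;
#endif
#ifdef __SSE3__
    puts("SSE3 enabled");
    enabled++;
#endif
#ifdef __SSSE3__
    puts("SSSE3 enabled");
    enabled++;
#endif
#ifdef __SSE4_1__
    puts("SSE4.1 enabled");
    enabled++;
#endif
#ifdef __SSE4_2__
    puts("SSE4.2 enabled");
    enabled++;
#endif
    return enabled;
}
```

Call it from main and build once with plain -O3 and once with -O3 -march=native. If the second build lists more extensions, the flag is giving the compiler extra instructions to work with. Whether that makes your comparisons faster is something only a measurement can tell you.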
(And for future question on this site, please use complete English grammar and punctuation.)
Warning: the answer is incorrect.
You can actually analyze all disabled and enabled optimizations yourself. Run on your computer:
gcc -O3 -Q --help=optimizers | grep disabled
Then read about the flags that are still disabled and that, according to the GCC documentation, can influence performance.
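If one of the disabled flags looks promising, you can try it on a single hot function instead of the whole program by using GCC's optimize function attribute (available since GCC 4.4). This is a sketch under assumptions: count_below is a hypothetical stand-in for your comparison loop, and -funroll-loops is just an example of a flag that -O3 does not enable by itself.

```c
#include <stddef.h>

/* Hypothetical hot loop: counts the values strictly below a threshold.
   The attribute asks GCC to apply -funroll-loops to this function only;
   other compilers may ignore it with a warning. */
__attribute__((optimize("unroll-loops")))
size_t count_below(const float *values, size_t n, float threshold)
{
    size_t count = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (values[i] < threshold)
            count++;
    }
    return count;
}
```

Treat this as an experiment and time the result. Passing the flag on the command line is the more usual route once you know it helps.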
You'll want to add an -march=... option. The ... should be replaced with whatever is closest to your CPU architecture (there tend to be minor differences) described in the i386/x86_64 options for GCC here.
I would use core2 because corei7 (the one you'd want) is only available in GCC 4.6 and later. See the arch list for GCC 4.6 here.