I'm just playing around with gcc (g++) and the compiler flags -msse and -msse2. I have a little test program which looks like this:
#include <iostream>

int main(int argc, char **argv) {
    float a = 12558.5688;
    float b = 6.5585;
    float result = 0.0;
    result = a * b;
    std::cout << "Result: " << result << std::endl;
    return 0;
}
When I compile it with the following commands:
/usr/local/bin/g++-4.9 -W -msse main.cpp -o testsse
and
/usr/local/bin/g++-4.9 -W -msse2 main.cpp -o testsse2
the output files are binary identical. But I expected them to differ because of the SIMD flags.
So my question is: do those compiler flags have no influence on the binary file? I've tested it on OS X 10.10.3 and Fedora 21.
Thanks for your help.
Kind regards
Fabian
The first thing you need to know is that SSE and SSE2 are enabled and used by default for 64-bit code, so passing -msse or -msse2 changes nothing there. For 32-bit code the default was x87 instructions.
The second thing you need to know is that double-precision floating point requires SSE2 (SSE only handles single precision), so if you want to see a difference between SSE and SSE2 in your example you should compare double with float.
The third thing you need to know is how to convince your compiler to not optimize your calculations away. One way to do this is to wrap your code in functions like this:
//foo.cpp
float foof(float x, float y) {
    return x*y;
}

double food(double x, double y) {
    return x*y;
}
then g++ -O3 -S foo.cpp shows that foof uses mulss whereas food uses mulsd. If you want to make sure it's getting the right results you can link it in like this:
//main.cpp
#include <iostream>

extern float foof(float, float);
extern double food(double, double);

int main(void) {
    float af = 12558.5688;
    float bf = 6.5585;
    float resultf = 0.0;
    double ad = af;
    double bd = bf;
    double resultd = 0.0;
    resultf = foof(af, bf);
    resultd = food(ad, bd);
    std::cout << "Resultf: " << resultf << " Resultd: " << resultd << std::endl;
}
Then do g++ -O3 -c foo.cpp and then g++ -O3 main.cpp foo.o.
If you want to disable SSE instructions then use -mfpmath=387 or compile in 32-bit mode with -m32.
In your code only very basic floating-point math is involved. And I bet if you turn optimizations on (even -O1) it gets optimized out, because those values are constant expressions and so can be computed at compile time.
SSE is used (movss, mulss) because scalar single-precision arithmetic is the baseline that SSE already provides. SSE2 has nothing to add here.
In order to find room for SSE2 you need to include more complex calculations which may or may not exploit some instructions available in SSE2; you could look up what some of them do, write their equivalent in C++, and see if the compiler can take advantage of them.