
Integer powers in C

In C code it is common to write

a = b*b;

instead of

a = pow(b, 2.0);

for double variables. I understand that, since pow is a general-purpose function that also handles non-integer exponents, one would naively expect the first version to be faster. I wonder, however, whether the compiler (gcc) transforms calls to pow with integer exponents into direct multiplication as part of any of its optional optimizations.

Assuming that this optimization does not take place, what is the largest integer exponent for which it is faster to write out the multiplication manually, as in b*b* ... *b?

I know that I could run performance tests on a given machine to figure out whether I should even care, but I would like to gain a deeper understanding of what "the right thing" to do is.

asked Sep 04 '17 15:09 by jmd_dk


1 Answer

What you want is -ffast-math (which already turns on -ffinite-math-only) and possibly #include <tgmath.h>. This gives you the floating-point relaxations of -Ofast without mandating the -O3 optimizations.

Besides helping these kinds of optimizations when -ffinite-math-only and -ffast-math are enabled, type-generic math also compensates for the cases where you forget to append the proper suffix to a (non-double) math function, such as writing pow instead of powf for a float argument.
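A minimal sketch of the suffix point, assuming a C99 or later compiler: with <tgmath.h>, pow is a macro that selects powf when both arguments are float, while parenthesizing the name suppresses the macro and reaches the plain double function. The two helper names are hypothetical and only report the size of each result type; sizeof does not evaluate its operand, so nothing is actually called.

```c
#include <stddef.h>
#include <tgmath.h>  /* includes <math.h> and defines the type-generic macros */

/* Hypothetical helper: size of the result type chosen by the
   type-generic pow macro for two float arguments (expected: float). */
size_t generic_pow_result_size(void)
{
    return sizeof(pow(1.0f, 4.0f));
}

/* Hypothetical helper: (pow) bypasses the macro, so this is the
   ordinary double function and the float arguments are promoted. */
size_t plain_pow_result_size(void)
{
    return sizeof((pow)(1.0f, 4.0f));
}
```

If the first helper returns sizeof(float), the float version of the function is the one the compiler gets to optimize, with no round trip through double.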

For example:

#include <tgmath.h>
float pow4(float f) { return pow(f, 4.0f); }
// compiles to (x86-64 with AVX enabled, as the vmulss instructions show):
pow4:
    vmulss  xmm0, xmm0, xmm0
    vmulss  xmm0, xmm0, xmm0
    ret

For clang this works for exponents up to 32, while gcc does this for exponents up to at least 2,147,483,647 (that's as far as I checked), unless -Os is enabled (because a jmp to the pow function is technically smaller). With -Os, gcc only expands an exponent of 2.
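The two vmulss instructions above square f twice, so the fourth power costs two multiplications instead of three. The same idea, written out by hand for any non-negative integer exponent, is repeated squaring. The sketch below uses a hypothetical helper named ipow; it illustrates the technique and is not a claim about the exact sequence gcc or clang emits.

```c
/* Hypothetical helper, not part of any library: raise b to a
   non-negative integer power n by repeated squaring, which needs on
   the order of log2(n) multiplications rather than n - 1. */
double ipow(double b, unsigned n)
{
    double result = 1.0;
    while (n != 0) {
        if (n & 1u)      /* low bit of n set: fold the current power in */
            result *= b;
        b *= b;          /* b, b^2, b^4, b^8, ... */
        n >>= 1;
    }
    return result;
}
```

For example, ipow(b, 4) ends up multiplying the squared value by itself, matching the assembly above. Each multiplication rounds, so the result can differ in the last bits from what pow returns, which is why the compiler only makes this substitution when the unsafe math flags allow it.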

WARNING: -ffast-math is a convenience option that turns on several other flags, many of which break the exact IEEE and ISO C rules for floating-point math. If you'd rather use only the minimal flags needed to get this behavior, you can use -fno-math-errno -funsafe-math-optimizations -ffinite-math-only
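Because these flags change what the compiler may assume (for instance, that no NaN or infinity ever occurs), it can be useful for code to detect how it was built. A minimal sketch, assuming GCC or Clang, which define __FINITE_MATH_ONLY__ as 1 under -ffinite-math-only and define __FAST_MATH__ under -ffast-math; the function names are hypothetical.

```c
/* Hypothetical helper: 1 if this translation unit was compiled with
   -ffinite-math-only (GCC and Clang set __FINITE_MATH_ONLY__ to 1). */
int built_with_finite_math_only(void)
{
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the compiler defined __FAST_MATH__,
   which GCC and Clang do under -ffast-math. */
int built_with_fast_math(void)
{
#ifdef __FAST_MATH__
    return 1;
#else
    return 0;
#endif
}
```

A file that depends on NaN checks could test the same macros in an #if and stop the build with #error instead of silently misbehaving.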

answered Nov 01 '22 17:11 by technosaurus