What is the reason for the catastrophic performance of pow() for NaN values? As far as I can work out, NaNs should not have an impact on performance if the floating-point math is done with SSE instead of the x87 FPU. This seems to be true for elementary operations, but not for pow(). I compared multiplication and division of a double to squaring and then taking the square root. If I compile the piece of code below with g++ -lrt, I get the following result:
multTime(3.14159): 20.1328ms
multTime(nan): 244.173ms
powTime(3.14159): 92.0235ms
powTime(nan): 1322.33ms
As expected, calculations involving NaN take considerably longer. Compiling with g++ -lrt -msse2 -mfpmath=sse, however, results in the following times:
multTime(3.14159): 22.0213ms
multTime(nan): 13.066ms
powTime(3.14159): 97.7823ms
powTime(nan): 1211.27ms
Multiplication and division of NaN are now much faster (actually faster than with a real number), but squaring and then taking the square root still take a very long time.
Test code (compiled with gcc 4.1.2 on 32-bit openSUSE 10.2 in VMware; the CPU is a Core i7-2620M):
#include <iostream>
#include <sys/time.h>
#include <cmath>

void multTime( double d )
{
    struct timespec startTime, endTime;
    double durationNanoseconds;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &startTime);
    for (int i = 0; i < 1000000; i++)
    {
        d = 2*d;
        d = 0.5*d;
    }
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &endTime);
    durationNanoseconds = 1e9*(endTime.tv_sec - startTime.tv_sec) + (endTime.tv_nsec - startTime.tv_nsec);
    std::cout << "multTime(" << d << "): " << durationNanoseconds/1e6 << "ms" << std::endl;
}

void powTime( double d )
{
    struct timespec startTime, endTime;
    double durationNanoseconds;
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &startTime);
    for (int i = 0; i < 1000000; i++)
    {
        d = pow(d,2);
        d = pow(d,0.5);
    }
    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &endTime);
    durationNanoseconds = 1e9*(endTime.tv_sec - startTime.tv_sec) + (endTime.tv_nsec - startTime.tv_nsec);
    std::cout << "powTime(" << d << "): " << durationNanoseconds/1e6 << "ms" << std::endl;
}

int main()
{
    multTime(3.14159);
    multTime(NAN);
    powTime(3.14159);
    powTime(NAN);
}
Edit:
Unfortunately, my knowledge of this topic is extremely limited, but my guess is that glibc's pow() never uses SSE on a 32-bit system, but rather some assembly in sysdeps/i386/fpu/e_pow.S. There is a function __ieee754_pow_sse2 in more recent glibc versions, but it lives in sysdeps/x86_64/fpu/multiarch/e_pow.c and is therefore probably only available on x86-64. However, all of this might be irrelevant here, because pow() is also a gcc built-in function. For an easy fix, see Z boson's answer.
"NaNs should not have an impact on performance if the floating-point math is done with SSE instead of the x87 FPU."
I'm not sure this follows from the resource you quote. In any case, pow is a C library function; it is not implemented as a single instruction, even on x87. So there are two separate issues here: how SSE handles NaN values, and how a particular pow implementation handles NaN values.
If the pow implementation uses a separate code path for special values like +/-Inf or NaN, you might expect a NaN base or exponent to return a result quickly. On the other hand, the implementation might not handle these as a separate case at all, and simply rely on the floating-point operations to propagate NaN through the intermediate results.
Starting with 'Sandy Bridge', many of the performance penalties associated with denormals were reduced or eliminated, though not all of them: the author describes a remaining penalty for mulps. It is therefore reasonable to expect that not all arithmetic operations involving NaNs are 'fast'. Some architectures might even fall back to microcode to handle NaNs in certain contexts.
Your math library is too old. Either find another math library whose pow handles NaN better, or implement a workaround like this:
inline double pow_fix(double x, double y)
{
    if (x != x) return x; // x is NaN: return it directly, skipping the slow pow() path
    if (y != y) return y; // y is NaN: likewise
    return pow(x, y);
}
Compile with g++ -O3 -msse2 -mfpmath=sse foo.cpp.