Our server application does a lot of integer tests in a hot code path, currently we use the following function:
inline int IsInteger(double n)
{
return n-floor(n) < 1e-8
}
This function is very hot in our workload, so I want it to be as fast as possible. I also want to eliminate the "floor" library call if I can. Any suggestions?
Here are a couple of answers:
#include <stdint.h>
#include <stdio.h>
#include <math.h>
int IsInteger1(double n)
{
union
{
uint64_t i;
double d;
} u;
u.d = n;
int exponent = ((u.i >> 52) & 0x7FF) - 1023;
uint64_t mantissa = (u.i & 0x000FFFFFFFFFFFFFllu);
return n == 0.0 ||
exponent >= 52 ||
(exponent >= 0 && (mantissa << (12 + exponent)) == 0);
}
int IsInteger2(double n)
{
return n - (double)(int)n == 0.0;
}
int IsInteger3(double n)
{
return n - floor(n) == 0.0;
}
And a test harness:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
int IsInteger1(double);
int IsInteger2(double);
int IsInteger3(double);
#define TIMEIT(expr, N) \
gettimeofday(&start, NULL); \
for(i = 0; i < N; i++) \
{ \
expr; \
} \
gettimeofday(&end, NULL); \
printf("%s: %f\n", #expr, (end.tv_sec - start.tv_sec) + 0.000001 * (end.tv_usec - start.tv_usec))
int main(int argc, char **argv)
{
const int N = 100000000;
struct timeval start, end;
int i;
double d = strtod(argv[1], NULL);
printf("d=%lf %d %d %d\n", d, IsInteger(d), IsInteger2(d), IsInteger3(d));
TIMEIT((void)0, N);
TIMEIT(IsInteger1(d), N);
TIMEIT(IsInteger2(d), N);
TIMEIT(IsInteger3(d), N);
return 0;
}
Compile as:
gcc isinteger.c -O3 -c -o isinteger.o
gcc main.c isinteger.o -o isinteger
My results, on an Intel Core Duo:
$ ./isinteger 12345
d=12345.000000 1 1 1
(void)0: 0.357215
IsInteger1(d): 2.017716
IsInteger2(d): 1.158590
IsInteger3(d): 2.746216
Conclusion: the bit twiddling isn't as fast as I might have guessed. The extra branches are probably what kills it, even though it avoids floating-point operations. FPUs are fast enough these days that doing a double
-to-int
conversion or a floor
really isn't that slow.
A while back I ran a bunch of timings on the most efficient way to convert between floats and integers, and wrote them up. I also timed techniques for rounding.
The short story for you is: converting from a float to an int, or using union hacks, is unlikely to be an improvement due to a CPU hazard called a load-hit-store -- unless the floats are coming from RAM and not a register.
Because it is an intrinsic, abs(floor(x)-eps) is probably the fastest solution. But because this is all very sensitive to the particular architecture of your CPU -- depending on very sensitive things like pipeline depth and store forwarding -- you'll need to time a variety of solutions to find one that is really optimal.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With