I am trying to implement a fixed-point class in C++, but I am running into performance problems. I have reduced the problem to a simple wrapper around the float type, and it is still slow. My question is: why is the compiler unable to optimize it fully?
The 'float' version is 50% faster than 'Float'. Why?!
(I am using Visual C++ 2008 in the Release configuration, of course, and I have tested all possible compiler options.)
See the code below:
#include <cstdio>
#include <cstdlib>
#include "Clock.h" // just for measuring time

#define real Float // Option 1
//#define real float // Option 2

struct Float
{
private:
    float value;
public:
    Float(float value) : value(value) {}
    operator float() { return value; }
    Float& operator=(const Float& rhs)
    {
        value = rhs.value;
        return *this;
    }
    Float operator+ (const Float& rhs) const
    {
        return Float( value + rhs.value );
    }
    Float operator- (const Float& rhs) const
    {
        return Float( value - rhs.value );
    }
    Float operator* (const Float& rhs) const
    {
        return Float( value * rhs.value );
    }
    bool operator< (const Float& rhs) const
    {
        return value < rhs.value;
    }
};

struct Point
{
    Point() : x(0), y(0) {}
    Point(real x, real y) : x(x), y(y) {}
    real x;
    real y;
};

int main()
{
    // Generate data
    const int N = 30000;
    Point points[N];
    for (int i = 0; i < N; ++i)
    {
        points[i].x = (real)(640.0f * rand() / RAND_MAX);
        points[i].y = (real)(640.0f * rand() / RAND_MAX);
    }
    real limit( 20 * 20 );

    // Check how many pairs of points are closer than 20
    Clock clk;
    int count = 0;
    for (int i = 0; i < N; ++i)
    {
        for (int j = i + 1; j < N; ++j)
        {
            real dx = points[i].x - points[j].x;
            real dy = points[i].y - points[j].y;
            real d2 = dx * dx + dy * dy;
            if ( d2 < limit )
            {
                count++;
            }
        }
    }
    double time = clk.time();
    printf("%d\n", count);
    printf("TIME: %lf\n", time);
    return 0;
}
IMO, it has to do with optimization flags. I checked your program with g++ on a 64-bit Linux machine. Without any optimization, I see the same roughly 50% difference you describe.
With maximum optimization turned on (-O4, which GCC treats the same as -O3), both versions perform the same. Turn on optimization and check.
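If you want to check this in a single build rather than flipping the #define, here is a rough sketch that templates the distance loop on the number type, so float and Float run through identical code. The helper name countClosePairs and the use of std::vector are my own assumptions, not something from the question, and the Float here is cut down to just the operators the loop needs.

```cpp
#include <cstddef>
#include <vector>

// Wrapper from the question, reduced to what the loop below uses.
struct Float
{
    Float(float v) : value(v) {}
    Float operator+(const Float& rhs) const { return Float(value + rhs.value); }
    Float operator-(const Float& rhs) const { return Float(value - rhs.value); }
    Float operator*(const Float& rhs) const { return Float(value * rhs.value); }
    bool operator<(const Float& rhs) const { return value < rhs.value; }
private:
    float value;
};

// Hypothetical helper (not in the question): counts the pairs of points
// whose squared distance is below limit2, using Real for the arithmetic.
template <typename Real>
int countClosePairs(const std::vector<float>& xs,
                    const std::vector<float>& ys,
                    float limit2)
{
    const Real limit(limit2);
    int count = 0;
    for (std::size_t i = 0; i < xs.size(); ++i)
    {
        for (std::size_t j = i + 1; j < xs.size(); ++j)
        {
            Real dx = Real(xs[i]) - Real(xs[j]);
            Real dy = Real(ys[i]) - Real(ys[j]);
            Real d2 = dx * dx + dy * dy;
            if (d2 < limit)
            {
                ++count;
            }
        }
    }
    return count;
}
```

Time countClosePairs<float>(xs, ys, 400.0f) and countClosePairs<Float>(xs, ys, 400.0f) with your Clock, once in an unoptimized build (-O0 with g++, /Od with Visual C++) and once in an optimized one (-O2 or /O2). If the gap only shows up in the unoptimized build, the wrapper itself is not the problem.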
Try not passing by reference. Your class is small enough that the overhead of passing it by reference (yes, there is overhead if the compiler doesn't optimize it out) might be higher than just copying the object. So this...
Float operator+ (const Float& rhs) const
{
    return Float( value + rhs.value );
}
becomes something like this...
Float operator+ (Float rhs) const
{
    rhs.value += value;
    return rhs;
}
which avoids constructing a temporary object and may avoid the indirection of a pointer dereference.
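For completeness, here is how the same pass-by-value idea might look for the other operators. Treat it as a sketch: the get() accessor is something I added just to inspect results, and whether any of this is actually faster depends on your compiler and flags. One thing to watch is that the "modify rhs and return it" trick only works as-is for commutative operations; for subtraction the operands have to stay in the right order.

```cpp
struct Float
{
    Float(float v) : value(v) {}

    // Hypothetical accessor, added here only so results can be inspected.
    float get() const { return value; }

    // Commutative operations: reuse the by-value copy of rhs directly.
    Float operator+(Float rhs) const { rhs.value += value; return rhs; }
    Float operator*(Float rhs) const { rhs.value *= value; return rhs; }

    // Subtraction is not commutative, so "rhs.value -= value" would give
    // rhs - this. Compute this - rhs explicitly instead.
    Float operator-(Float rhs) const { rhs.value = value - rhs.value; return rhs; }

    bool operator<(Float rhs) const { return value < rhs.value; }

private:
    float value;
};
```

With this version, Float(5.0f) - Float(3.0f) holds 2.0f, not -2.0f, which is the mistake a mechanical copy of the operator+ body would make.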