How to optimize a simple numeric type wrapper class in C++?

Question

I am trying to implement a fixed-point class in C++, but I am running into performance problems. I have reduced the problem to a simple wrapper around the float type, and it is still slow. My question is: why is the compiler unable to optimize it fully?

The 'float' version is 50% faster than 'Float'. Why?!

(I use Visual C++ 2008. I have tested all available compiler options, in the Release configuration of course.)

See the code below:

#include <cstdio>
#include <cstdlib>
#include "Clock.h"      // just for measuring time

#define real Float      // Option 1
//#define real float        // Option 2

struct Float
{
private:
    float value;

public:
    Float(float value) : value(value) {}
    operator float() { return value; }

    Float& operator=(const Float& rhs)
    {
        value = rhs.value;
        return *this;
    }

    Float operator+ (const Float& rhs) const
    {
        return Float( value + rhs.value );
    }

    Float operator- (const Float& rhs) const
    {
        return Float( value - rhs.value );
    }

    Float operator* (const Float& rhs) const
    {
        return Float( value * rhs.value );
    }

    bool operator< (const Float& rhs) const
    {
        return value < rhs.value;
    }
};

struct Point
{
    Point() : x(0), y(0) {}
    Point(real x, real y) : x(x), y(y) {}

    real x;
    real y;
};

int main()
{
    // Generate data
    const int N = 30000;
    Point points[N];
    for (int i = 0; i < N; ++i)
    {
        points[i].x = (real)(640.0f * rand() / RAND_MAX);
        points[i].y = (real)(640.0f * rand() / RAND_MAX);
    }

    real limit( 20 * 20 );

    // Check how many pairs of points are closer than 20
    Clock clk;

    int count = 0;
    for (int i = 0; i < N; ++i)
    {
        for (int j = i + 1; j < N; ++j)
        {
            real dx = points[i].x - points[j].x;
            real dy = points[i].y - points[j].y;
            real d2 = dx * dx + dy * dy;
            if ( d2 < limit )
            {
                count++;
            }
        }
    }

    double time = clk.time();

    printf("%d\n", count);
    printf("TIME: %lf\n", time);

    return 0;
}

iammilind · Accepted Answer

IMO, it has to do with optimization flags. I checked your program with g++ on a 64-bit Linux machine. Without any optimization, I get the same result you reported: a difference of about 50%.

With maximum optimization turned on (i.e. -O4), both versions perform the same. Turn on optimization and check.
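To compare both variants in a single build, and without the question's Clock.h, one option is to template the loop on the numeric type and time it with std::chrono. The following is only a sketch: the names Wrapped, count_close_pairs and time_ms are hypothetical, std::chrono assumes a C++11 compiler (so not Visual C++ 2008), and the timings depend entirely on the compiler and flags used (for example g++ -O0 versus g++ -O2).

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's Float wrapper.
struct Wrapped
{
    float value;

    Wrapped(float v) : value(v) {}

    Wrapped operator+(const Wrapped& rhs) const { return Wrapped(value + rhs.value); }
    Wrapped operator-(const Wrapped& rhs) const { return Wrapped(value - rhs.value); }
    Wrapped operator*(const Wrapped& rhs) const { return Wrapped(value * rhs.value); }
    bool operator<(const Wrapped& rhs) const { return value < rhs.value; }
};

// Counts the pairs of points whose distance is below `limit`.
// Real is the type under test: float or Wrapped.
template <typename Real>
int count_close_pairs(const std::vector<float>& xs,
                      const std::vector<float>& ys,
                      float limit)
{
    const Real limit2(limit * limit);   // compare squared distances
    const std::size_t n = xs.size();
    int count = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        for (std::size_t j = i + 1; j < n; ++j)
        {
            Real dx = Real(xs[i]) - Real(xs[j]);
            Real dy = Real(ys[i]) - Real(ys[j]);
            Real d2 = dx * dx + dy * dy;
            if (d2 < limit2)
            {
                ++count;
            }
        }
    }
    return count;
}

// Runs the loop once for the given type and returns the elapsed
// wall-clock time in milliseconds; the pair count goes to count_out.
template <typename Real>
double time_ms(const std::vector<float>& xs,
               const std::vector<float>& ys,
               float limit,
               int& count_out)
{
    const auto start = std::chrono::steady_clock::now();
    count_out = count_close_pairs<Real>(xs, ys, limit);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_ms<float> and time_ms<Wrapped> on the same data, once per optimization level, should show whether the gap closes when optimization is enabled. Both instantiations must return the same count; if they do not, the comparison is not measuring the same work.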

Skyler Saleh · Answer

Try not passing by reference. Your class is small enough that the overhead of passing it by reference (yes, there is overhead if the compiler doesn't optimize it out) might be higher than just copying the class. So this...

Float operator+ (const Float& rhs) const
{
   return Float( value + rhs.value );
}

becomes something like this...

Float operator+ (Float rhs) const
{
   rhs.value+=value;
   return rhs;
}

which avoids a temporary object and may avoid some indirection of a pointer dereference.
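Applied to the whole class, the suggestion might look like the sketch below. This is an illustration of the pass-by-value idea, not a measured fix: whether it helps depends on the compiler. The const on the conversion operator and the removal of the hand-written operator= (the compiler-generated one does the same thing) are additional assumptions, not part of the original answer.

```cpp
// Sketch: the question's Float wrapper with operators taking rhs by value.
struct Float
{
private:
    float value;

public:
    Float(float value) : value(value) {}
    operator float() const { return value; }

    // Copy assignment is left to the compiler; the implicit version
    // copies `value` exactly like the hand-written one did.

    Float operator+(Float rhs) const
    {
        rhs.value += value;     // addition commutes, so rhs can be reused
        return rhs;
    }

    Float operator-(Float rhs) const
    {
        // Subtraction does not commute, so build the result explicitly
        // instead of writing rhs.value -= value.
        return Float(value - rhs.value);
    }

    Float operator*(Float rhs) const
    {
        rhs.value *= value;     // multiplication commutes as well
        return rhs;
    }

    bool operator<(Float rhs) const
    {
        return value < rhs.value;
    }
};
```

Since the class holds a single float, a by-value parameter can typically be passed in a register, whereas a reference may force the value through memory if the call is not inlined.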

Tags:

c++

performance

optimization

fixed-point

Michal Czardybon
