
Is there a performance benefit/penalty when adding int-values to doubles?

Tags:

c++

Given a vector addition:

  NPNumber NPNumber::plus(const double o) const {
    vector<double> c;
    for (double a : values)
       c.push_back(a + o);

    return NPNumber(width, c);
  }

Where NPNumber contains a vector of doubles (field values), when I only add a single integer, instead of another NPNumber, is there a performance benefit or penalty compared to converting that integer and using the function above?

i.e., is this faster/slower on any architecture:

  NPNumber NPNumber::plus(const int i) const {
    vector<double> c;
    for (double a : values)
       c.push_back(a + i);

    return NPNumber(width, c);
  }

choeger asked Nov 25 '13 11:11


2 Answers

It's strongly compiler dependent, and you should measure it in your own code. A quick look at the generated code on my machine (32-bit MinGW/gcc 4.9) shows that the + itself is the same in both cases, though the integer case seems slightly better.

Adding two doubles:

!        double d = 0.2;
fldl   0x409070
fstpl  -0x10(%ebp)

!        double y = 1.0;
fld1   
fstpl  -0x18(%ebp)

!        double z = d + y;
fldl   -0x10(%ebp)
faddl  -0x18(%ebp)
fstpl  -0x20(%ebp)

Adding an int to a double:

!        double d = 0.2;
fldl   0x409070
fstpl  -0x28(%ebp)

!        int y = 1;
movl   $0x1,-0x2c(%ebp)

!        double z = d + y;
fildl  -0x2c(%ebp)
faddl  -0x28(%ebp)
fstpl  -0x38(%ebp)

Both use faddl for the addition. The only difference is how the operands get there: the int is stored with a plain movl and converted to floating point as fildl loads it. So there is no penalty for adding an integer to a double (and it may even be slightly better than adding two doubles).

In your application, profiling is the best way to find out which one is better.

masoud answered Sep 28 '22 10:09


Another thing to consider is compiler optimizations.

Floating-point units tend to have their own registers. In some cases these have greater precision than the typical operands (for instance, 80-bit temporary reals); however, see the comments, as this can vary a lot.

I would expect it is cheaper to operate on values already loaded into the FPU, and the compiler should know this. As such, it may hoist the promotion of your constant value out of the loop and keep the value loaded in the FPU, in which case the difference would be negligible on large vectors.
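
A minimal sketch of what that hoisting amounts to, written out by hand. The free functions `add_naive` and `add_hoisted` are hypothetical names for illustration and are not part of NPNumber; I would expect an optimizing compiler to turn the first into something like the second on its own, but that is an assumption to check in the generated code.

```cpp
#include <vector>

// Hypothetical helper: the int operand is (conceptually) promoted to
// double on every iteration of the loop.
std::vector<double> add_naive(const std::vector<double>& values, int i) {
    std::vector<double> c;
    c.reserve(values.size());
    for (double a : values)
        c.push_back(a + i);  // i is implicitly converted to double here
    return c;
}

// Hypothetical helper: the conversion is hoisted out of the loop by hand,
// so the loop body only ever adds two doubles.
std::vector<double> add_hoisted(const std::vector<double>& values, int i) {
    const double o = static_cast<double>(i);  // converted exactly once
    std::vector<double> c;
    c.reserve(values.size());
    for (double a : values)
        c.push_back(a + o);
    return c;
}
```

Both functions produce the same values; the only difference is where the conversion happens.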

In any event, I would hope that if the int-to-double conversion is expensive on a given platform, a respectable compiler would not perform it redundantly. As such, what I'd probably do is make it a template method, so it accepts the constant in whatever type and precision it naturally comes in; this lets the compiler "do the right thing" for the particular platform in any given situation.
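
A rough sketch of the template approach, assuming a class shaped like the one in the question. `NPNumberSketch` is a hypothetical stand-in, not the real NPNumber; the `int` type of `width` and the `values()` accessor are assumptions made so the example is self-contained.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for NPNumber from the question.
class NPNumberSketch {
public:
    NPNumberSketch(int width, std::vector<double> values)
        : width_(width), values_(std::move(values)) {}

    // One member template covers int, float, double, ...; the usual
    // arithmetic conversions promote `o` to double inside `a + o`.
    template <typename T>
    NPNumberSketch plus(T o) const {
        std::vector<double> c;
        c.reserve(values_.size());
        for (double a : values_)
            c.push_back(a + o);
        return NPNumberSketch(width_, std::move(c));
    }

    // Accessor added only so the result can be inspected.
    const std::vector<double>& values() const { return values_; }

private:
    int width_;
    std::vector<double> values_;
};
```

With this, `n.plus(1)` and `n.plus(1.0)` instantiate separate functions, and the compiler decides for each one how and when to convert the operand.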

With that said, compilers do vary quite a bit in their optimization strategies and platforms vary in their features & performance characteristics, so if you're trying to squeeze out every last microsecond, you should do profiling for your platform(s) of interest.
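
If you want a starting point before reaching for a real profiler, a crude timing harness can be sketched as below. `time_ms` and `sum_after_add` are hypothetical helpers made up for this example; it uses only `std::chrono::steady_clock` from the standard library, and the numbers it gives are only meaningful in an optimized build with enough repetitions.

```cpp
#include <chrono>
#include <vector>

// Hypothetical helper: calls f() `reps` times and returns the elapsed
// wall-clock time in milliseconds.
template <typename F>
double time_ms(F&& f, int reps) {
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r)
        f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical workload: adds `o` (int or double) to every element and
// returns the sum, so the additions have an observable result.
template <typename T>
double sum_after_add(const std::vector<double>& values, T o) {
    double sum = 0.0;
    for (double a : values)
        sum += a + o;
    return sum;
}
```

A comparison would then look like `time_ms([&] { sink += sum_after_add(v, 1); }, 1000)` against the same call with `1.0`, where `sink` is a variable you print afterwards so the compiler cannot discard the work.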

Kevin answered Sep 28 '22 09:09