C/C++: Multiply, or bitshift then divide? [duplicate]

Where it's possible to do so, I'm wondering whether it's faster to replace a single multiplication with a bitshift plus an integer division. Say I've got an int k and I want to multiply it by 2.25.

What's faster?

int k = 5;
k *= 2.25;
std::cout << k << std::endl;

or

int k = 5;
k = (k<<1) + (k/4);
std::cout << k << std::endl;

Output

11
11

Both give the same result (2.25 = 2 + 1/4, so 5 * 2.25 = 11.25 truncates to 11, and (5<<1) + (5/4) = 10 + 1 = 11); you can check this with a full example.

Asked Jun 01 '14 by MQDuck


4 Answers

The first attempt

I defined functions regularmultiply() and bitwisemultiply() as follows:

int regularmultiply(int j)
{
    return j * 2.25;             // int -> double multiply, truncated back to int
}

int bitwisemultiply(int k)
{
    return (k << 1) + (k >> 2);  // (k*2) + (k/4), the divide done as an arithmetic right shift
}

Profiling with Instruments (in Xcode on a 2009 MacBook running OS X 10.9.2) suggested that bitwisemultiply executed about twice as fast as regularmultiply.

The assembly code output seemed to confirm this, with bitwisemultiply spending most of its time on register shuffling and function returns, while regularmultiply spent most of its time on the multiplying.

(assembly listings for regularmultiply and bitwisemultiply omitted)

But the length of my trials was too short.

The second attempt

Next, I tried executing both functions with 10 million multiplications, this time putting the loops inside the functions so that function entry and exit wouldn't obscure the numbers. This time, each method took about 52 milliseconds. So at least for a relatively large but not gigantic number of calculations, the two functions take about the same time. This surprised me, so I decided to calculate for longer and with larger numbers.

The third attempt

This time, I multiplied each integer from 100 million through 500 million by 2.25, and bitwisemultiply actually came out slightly slower than regularmultiply.

The final attempt

Finally, I switched the order of the two functions, just to see whether the growing CPU graph in Instruments was slowing the second function down. But still, regularmultiply performed slightly better.

Here is what the final program looked like:

#include <stdio.h>

void regularmultiplyloop(int j);
void bitwisemultiplyloop(int k);

int main(void)
{
    int j, k;

    j = k = 4;
    bitwisemultiplyloop(k);
    regularmultiplyloop(j);

    return 0;
}

void regularmultiplyloop(int j)
{
    for (int m = 0; m < 10; m++)
    {
        for (int i = 100000000; i < 500000000; i++)
        {
            j = i;
            j *= 2.25;               /* double multiply, truncated back to int */
        }
        printf("j: %d\n", j);
    }
}

void bitwisemultiplyloop(int k)
{
    for (int m = 0; m < 10; m++)
    {
        for (int i = 100000000; i < 500000000; i++)
        {
            k = i;
            k = (k << 1) + (k >> 2); /* (k*2) + (k/4) via shifts */
        }
        printf("k: %d\n", k);
    }
}

Conclusion

So what can we say about all this? One thing we can say for certain is that optimizing compilers are better at this than most people. Furthermore, those optimizations show up even more when there is a lot of computation, which is the only time you'd really want to optimize anyway. So unless you're coding your optimizations in assembly, changing multiplication to bit shifting probably won't help much.
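For example, the scaling can be written as plain integer arithmetic and the instruction selection left to the compiler. A minimal sketch (the function name is just for illustration; the exact instructions chosen will vary with compiler, target, and optimization level):

int times_2_25(int k)
{
    // 2.25 == 9/4; for non-negative k this gives the same truncated result
    // as (k<<1) + (k/4). With optimization enabled (e.g. -O2), the constant
    // multiply and divide are typically strength-reduced by the compiler itself.
    return k * 9 / 4;
}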

It's always good to think about efficiency in your applications, but the gains of micro-efficiency are usually not enough to warrant making your code less readable.

Answered by Chris Middleton


That depends heavily on the CPU architecture: floating-point arithmetic, including multiplication, has become quite cheap on many CPUs. But the necessary float-to-int conversion can bite you: on POWER CPUs, for instance, the regular multiplication will crawl along because of the pipeline flushes that are generated when a value is moved from the floating-point unit to the integer unit.
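To see exactly where those conversions happen, k *= 2.25 with an int k can be spelled out explicitly (an equivalent, purely illustrative rewrite):

#include <iostream>

int main()
{
    int k = 5;
    // What the compound assignment does under the hood:
    // convert k to double, multiply, then truncate the result back to int.
    k = static_cast<int>(static_cast<double>(k) * 2.25);
    std::cout << k << std::endl;   // prints 11
    return 0;
}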

On some CPUs (including mine, which is an AMD CPU), this version is actually the fastest:

k *= 9;    // 2.25 == 9/4: multiply by 9 ...
k >>= 2;   // ... then divide by 4 with an arithmetic right shift

because these CPUs can do a 64-bit integer multiplication in a single cycle. Other CPUs are definitely slower with my version than with your bitshift version, because their integer multiplication is not as heavily optimized. Most CPUs aren't as bad at multiplication as they used to be, but a multiplication can still take more than four cycles.

So, if you know which CPU your program will run on, measure which is fastest. If you don't know, your bitshift version won't perform badly on any architecture (unlike both the regular version and mine), which makes it a really safe bet.
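A minimal sketch of such a measurement, assuming a C++14 compiler (the helper name, iteration count, and labels are arbitrary choices, and the numbers are sensitive to the optimization level you build with):

#include <chrono>
#include <cstdint>
#include <iostream>

int main()
{
    const int iterations = 100000000;   // arbitrary, just long enough to time
    using clock = std::chrono::steady_clock;

    // Time one variant over `iterations` runs; accumulate into sink and print it
    // so the compiler cannot discard the loop as dead code.
    auto bench = [&](const char* name, auto op) {
        std::int64_t sink = 0;
        const auto start = clock::now();
        for (int i = 1; i <= iterations; ++i)
            sink += op(i);
        const auto stop = clock::now();
        const std::chrono::duration<double, std::milli> ms = stop - start;
        std::cout << name << ms.count() << " ms  (sink = " << sink << ")\n";
    };

    bench("k *= 2.25      : ", [](int k) { return static_cast<int>(k * 2.25); });
    bench("(k<<1) + (k/4) : ", [](int k) { return (k << 1) + (k / 4); });
    bench("(k*9) >> 2     : ", [](int k) { return (k * 9) >> 2; });
    return 0;
}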

Answered by cmaster - reinstate monica


Indeed, it depends on a variety of factors, so I checked by running the code and measuring the time. The statement we are interested in takes only a few CPU instructions, which is very fast, so I wrapped it in a loop to multiply the execution time of that one statement by a large number. The result I got is that k *= 2.25; is about 1.5 times slower than k = (k<<1) + (k/4);. Here are my two programs to compare:

prog1:

#include <iostream>
using namespace std;

int main()
{
    int k = 5;
    for (unsigned long i = 0; i <= 0x2fffffff; i++)
        k = (k << 1) + (k / 4);
    cout << k << endl;

    return 0;
}

prog2:

#include <iostream>
using namespace std;

int main()
{
    int k = 5;
    for (unsigned long i = 0; i <= 0x2fffffff; i++)
        k *= 2.25;
    cout << k << endl;

    return 0;
}

Prog1 takes 8 seconds and Prog2 takes 14 seconds. So by running this test with your own architecture and compiler, you can get the result that is correct for your particular environment.

Answered by Ruslan Gerasimov


It depends heavily on what hardware you are using. On modern hardware, floating-point multiplications may run much faster than integer ones, so you might want to change the entire algorithm and start using doubles instead of integers. If you're writing for modern hardware and you have a lot of operations like multiplying by 2.25, I'd suggest using double rather than int, if nothing else prevents you from doing that.
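A minimal sketch of that all-double approach (illustrative only; where you eventually convert back to an integer depends on the surrounding code):

#include <iostream>

int main()
{
    // Keep the value in floating point throughout, so no int <-> double
    // conversions are needed until an integer result is actually required.
    double k = 5.0;
    k *= 2.25;                     // pure floating-point multiply; k is now 11.25
    std::cout << k << std::endl;   // prints 11.25
    return 0;
}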

And be data-driven: measure performance, because it is affected by the compiler, the hardware, and the way you implement your algorithm.

Answered by Alex