 

Do processors actually calculate multiplication by a zero or one? Why?

The Short Version

In the following line:

aData[i] = aData[i] + ( aOn * sin( i ) );

If aOn is 0 or 1, does the processor actually perform the multiplication, or does it conditionally work out the result (0 when aOn is 0, sin( i ) when aOn is 1)?

The Long Version

I'm investigating algorithm performance consistency, which partly involves looking at the effect of branch prediction.

The hypothesis is that this code:

for ( i = 0; i < iNumSamples; i++ )
    aData[i] = aData[i] + ( aOn * sin( i ) );

will provide more stable performance than this code (where branch prediction may destabilise performance):

for ( i = 0; i < iNumSamples; i++ )
{
    if ( aOn )
        aData[i] = aData[i] + sin( i );
}

with aOn being either 0 or 1; it can be toggled by another thread while the loop is executing.

In reality, the conditional calculation (+ sin( i ) in the example above) involves more processing, and the if condition has to be inside the loop (there are many conditions, not just one as in the example above; also, changes to aOn should take effect immediately, not once per loop pass).
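
As a side note, for the flag toggling to be well-defined when another thread flips it mid-loop, the flag would need to be read through a synchronised type. Below is a minimal sketch of how the first (branchless) variant could look, assuming aOn is wrapped in std::atomic; the wrapper and the function name Process are illustrative additions, not part of the original code:

#include <atomic>
#include <cmath>

std::atomic<int> aOn( 1 );   // toggled by another thread at any time

void Process( float* aData, int iNumSamples )
{
    for ( int i = 0; i < iNumSamples; i++ )
    {
        // The flag is re-read on every iteration, so a toggle from the
        // other thread takes effect immediately, not once per loop pass.
        aData[i] = aData[i] + ( aOn.load( std::memory_order_relaxed ) * std::sin( (double)i ) );
    }
}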

Ignoring performance consistency, the tradeoff between the two options comes down to the time it takes to execute the if statement versus the time it takes to perform the multiplication.

Regardless, it is easy to see that if a processor did not actually perform the multiplication for values like 1 and 0, the first option would be a win-win (no branch prediction involved, and better performance).
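
One way to measure that tradeoff directly would be to time the two variants against each other. Here is a minimal sketch; the buffer size, the use of std::vector, and the function names are assumptions for illustration, and it should be compiled without optimisations so the compiler does not transform the loops:

#include <cmath>
#include <cstdio>
#include <ctime>
#include <vector>

// Branchless variant: always multiply sin( i ) by the flag.
void LoopMultiply( std::vector<float>& aData, int aOn )
{
    int iNumSamples = (int)aData.size();
    for ( int i = 0; i < iNumSamples; i++ )
        aData[i] = aData[i] + ( aOn * std::sin( (double)i ) );
}

// Branching variant: only add sin( i ) when the flag is set.
void LoopBranch( std::vector<float>& aData, int aOn )
{
    int iNumSamples = (int)aData.size();
    for ( int i = 0; i < iNumSamples; i++ )
    {
        if ( aOn )
            aData[i] = aData[i] + std::sin( (double)i );
    }
}

int main()
{
    std::vector<float> aData( 10000000, 0.0f );

    clock_t iStart = clock();
    LoopMultiply( aData, 1 );
    printf( "Multiply: %li clock ticks\n", (long)( clock() - iStart ) );

    iStart = clock();
    LoopBranch( aData, 1 );
    printf( "Branch:   %li clock ticks\n", (long)( clock() - iStart ) );

    return 0;
}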

Asked Jul 08 '13 by Izhaki

1 Answer

Processors perform the regular multiplication even when the multiplier is 0 or 1.

The reason is that if the processor checked for 0 and 1 before every calculation, the check itself would take extra cycles. You would gain performance for 0 and 1 multipliers, but lose it for every other value (which are far more probable).

A simple program can prove this:

#include <cstdio>
#include <cstdlib>
#include <ctime>

// Times 100 million multiplications by the given coefficient.
void Loop( float aCoefficient )
{
    float iSum = 0.0f;

    clock_t iStart, iEnd;

    iStart = clock();
    for ( int i = 0; i < 100000000; i++ )
    {
        // rand() keeps the multiplicand unpredictable, so the multiply
        // cannot be folded away at compile time.
        iSum += aCoefficient * rand();
    }
    iEnd = clock();
    printf( "Coefficient: %f: %li clock ticks\n", aCoefficient, (long)( iEnd - iStart ) );
}

int main( int argc, const char * argv[] )
{
    Loop( 0.0f );
    Loop( 1.0f );
    Loop( 0.25f );

    return 0;
}

For which the output is:

Coefficient: 0.000000: 1380620 clock ticks
Coefficient: 1.000000: 1375345 clock ticks
Coefficient: 0.250000: 1374483 clock ticks 
Answered Oct 12 '22 by Izhaki