I'm stuck trying to figure out how to convert the last two "if" statements of the following code to a branchless form.
int u, x, y;
x = rand() % 100 - 50;
y = rand() % 100 - 50;
u = rand() % 4;
if ( y > x) u = 5;
if (-y > x) u = 4;
Or, in case the above turns out to be too difficult, you can consider them as:
if (x > 0) u = 5;
if (y > 0) u = 4;
I think what gets me is the fact that those don't have an else catcher. If they did, I could probably have adapted a variation of a branchless abs (or max/min) function.
The rand() functions you see aren't part of the real code. I added them like this just to hint at the expected ranges that the variables x, y and u can possibly have at the time the two branches happen.
Assembly is allowed for this purpose.
EDIT:
After a bit of braingrinding I managed to put together a working branchless version:
int u, x, y;
x = rand() % 100 - 50;
y = rand() % 100 - 50;
u = rand() % 4;
u += (5-u)*((unsigned int)(x-y) >> 31);   // y > x  -> u = 5 (assumes 32-bit int)
u += (4-u)*((unsigned int)(x+y) >> 31);   // -y > x -> u = 4, applied last as in the if version
Unfortunately, due to the integer arithmetic involved, the original version with if statements turns out to be about 30% faster.
Compiler knows where the party is at.
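For what it's worth, a mask-based select is one way to drop the multiplies from the version above. This is only a sketch: the names `pick_branchy`, `pick_masked` and `check_all` are made up for illustration, the inputs are assumed to stay in the ranges hinted at by the rand() calls, and the compiler is still free to turn either form into conditional moves or branches, so any speed difference would have to be measured.

```c
#include <assert.h>

/* Reference: the two if statements from the question. */
int pick_branchy(int u, int x, int y)
{
    if ( y > x) u = 5;
    if (-y > x) u = 4;
    return u;
}

/* Mask-based select (hypothetical helper, not from the original post).
   A comparison yields 0 or 1; subtracting it from 0 in unsigned
   arithmetic gives an all-zeros or all-ones mask. */
int pick_masked(int u, int x, int y)
{
    unsigned int r  = (unsigned int)u;
    unsigned int m5 = 0u - (unsigned int)( y > x);  /* all ones when  y > x */
    unsigned int m4 = 0u - (unsigned int)(-y > x);  /* all ones when -y > x */
    r = (r & ~m5) | (5u & m5);
    r = (r & ~m4) | (4u & m4);  /* applied last, so 4 wins when both hold */
    return (int)r;
}

/* Exhaustive comparison over the ranges hinted at in the question. */
void check_all(void)
{
    for (int u = 0; u < 4; ++u)
        for (int x = -50; x < 50; ++x)
            for (int y = -50; y < 50; ++y)
                assert(pick_masked(u, x, y) == pick_branchy(u, x, y));
}
```

Calling check_all() from a test main walks all 40,000 combinations and aborts on the first mismatch.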
[All: this answer was written on the assumption that the calls to rand() were part of the problem, and I offer an improvement below under that assumption. The OP belatedly clarified that he only used rand to tell us the ranges (and presumably the distribution) of the values of x and y. It is unclear whether he meant that for the value of u, too. Anyway, enjoy my improved answer to the problem he didn't really pose.]
I think you'd be better off recoding this as:
int u, x, y;
x = rand() % 100 - 50;
y = rand() % 100 - 50;
if (-y > x) u = 4;        // tested first so that 4 wins when both conditions hold, as in the original
else if ( y > x) u = 5;
else u = rand() % 4;
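With x and y uniform over [-50, 49], neither test fires only when x >= |y|, which covers a quarter of the (x, y) square, so this calls the last rand only 1/4 as often as the OP's original code. Since I assume rand (and the divides) are much more expensive than compare-and-branch, this would be a significant saving.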
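The 1/4 figure can be checked by brute force. The sketch below assumes x and y are uniform and independent over [-50, 49], as the rand() % 100 - 50 expressions suggest; the function names `count_fallthrough` and `check_quarter` are made up for illustration.

```c
#include <assert.h>

/* Counts the (x, y) pairs in [-50, 49] x [-50, 49] for which neither
   test fires, i.e. the cases where the final rand() is still needed. */
int count_fallthrough(void)
{
    int n = 0;
    for (int x = -50; x < 50; ++x)
        for (int y = -50; y < 50; ++y)
            if (!(y > x) && !(-y > x))
                ++n;
    return n;
}

/* The fall-through cases should make up a quarter of the 100 * 100 grid. */
void check_quarter(void)
{
    assert(count_fallthrough() * 4 == 100 * 100);
}
```

Under that assumption the count comes out to 2500 of the 10000 pairs.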
If your rand generator produces a lot of truly random bits (e.g. 16) on each call as it should, you can call it just once (I've assumed rand is more expensive than divide, YMMV):
int u, x, y, t;
t = rand();
u = t % 4;
t = t >> 2;
x = t % 100 - 50;
y = (t / 100) % 100 - 50;
if (-y > x) u = 4;        // tested first so that 4 wins when both conditions hold
else if ( y > x) u = 5;
I think that the rand function in the MS C library is not good enough for this if you want really random values: its RAND_MAX is 32767, so after the shift t / 100 never exceeds 81 and y tops out at 31. I had to code my own; it turned out faster anyway.
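The answer does not show its home-grown generator. Purely as an illustration, here is a sketch that uses Marsaglia's 32-bit xorshift in its place and splits one draw the same way as the single-call version above. The names `xorshift32`, `draw_uxy` and `check_ranges` are hypothetical, and the statistical quality of the low bits is not evaluated here.

```c
#include <assert.h>
#include <stdint.h>

/* Marsaglia's 32-bit xorshift step; the state must never be zero. */
uint32_t xorshift32(uint32_t *state)
{
    uint32_t s = *state;
    s ^= s << 13;
    s ^= s >> 17;
    s ^= s << 5;
    *state = s;
    return s;
}

/* Split one 32-bit draw into u in [0, 3] and x, y in [-50, 49]. */
void draw_uxy(uint32_t *state, int *u, int *x, int *y)
{
    uint32_t t = xorshift32(state);
    *u = (int)(t % 4);
    t >>= 2;
    *x = (int)(t % 100) - 50;
    *y = (int)((t / 100) % 100) - 50;
}

/* Sanity check: every draw stays inside the ranges from the question. */
void check_ranges(void)
{
    uint32_t state = 1;
    for (int i = 0; i < 100000; ++i) {
        int u, x, y;
        draw_uxy(&state, &u, &x, &y);
        assert(u >= 0 && u <= 3);
        assert(x >= -50 && x <= 49);
        assert(y >= -50 && y <= 49);
    }
}
```

With 32 bits per draw there are plenty left after the shift to cover both x and y, which is the property the 15-bit library rand lacks.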
You might also get rid of the divide by multiplying by a reciprocal (untested):
int u, x, y;
unsigned int t;
unsigned long long t2;   // must be 64 bits wide; unsigned long is only 32 bits on some platforms
t = rand();              // assumed to be below 4*100*100 = 40000 so that x stays in range
u = t % 4;
{ // Compute the value of (x + 50) * 2^32 in a 64-bit integer by multiplying.
  // The (unsigned int) term below should be folded into a single constant at compile time
  // (rounded up so the truncated product does not undershoot).
  // The remaining multiply can be done by one machine instruction
  // (typically 32bits * 32bits --> 64bits) widely found in processors.
  // The "4" has the same effect as the t = t >> 2 in the previous version
  t2 = (unsigned long long)t * (unsigned int)((1ULL << 32) / (4 * 100) + 1);
}
x = (int)(t2 >> 32) - 50; // take the upper word (if compiler won't, do this in assembler)
{ // compute y from the fractional remainder of the above multiply,
  // which is sitting in the lower 32 bits of the t2 product:
  // scale it by 100 and take the upper word again
  y = (int)(((t2 & 0xFFFFFFFFULL) * 100) >> 32) - 50;
}
if (-y > x) u = 4;        // tested first so that 4 wins when both conditions hold
else if ( y > x) u = 5;
If your compiler won't produce the "right" instructions, it should be straightforward to write assembly code to do this.
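Since the reciprocal version is marked untested, a brute-force comparison against plain division is a cheap way to gain some confidence. The sketch below is an assumption-laden illustration, not the author's code: the names `RECIP_400`, `split_by_reciprocal` and `check_reciprocal` are hypothetical, the reciprocal is rounded up, and the input is assumed to stay below 40000 so that the quotient is below 100.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 / 400, rounded up so the truncated product never undershoots. */
#define RECIP_400 ((uint32_t)((UINT64_C(1) << 32) / 400 + 1))

/* Quotient (t / 4) / 100 from the upper word and remainder (t / 4) % 100
   from the lower word, with no division at run time. Assumes t < 40000. */
void split_by_reciprocal(uint32_t t, int *q, int *r)
{
    uint64_t p = (uint64_t)t * RECIP_400;         /* 32 x 32 -> 64 bit multiply */
    *q = (int)(p >> 32);                          /* integer part of t / 400 */
    *r = (int)(((p & 0xFFFFFFFFu) * 100) >> 32);  /* fractional part scaled by 100 */
}

/* Compare against ordinary shift, divide and modulo for every valid input. */
void check_reciprocal(void)
{
    for (uint32_t t = 0; t < 40000; ++t) {
        int q, r;
        split_by_reciprocal(t, &q, &r);
        assert(q == (int)((t >> 2) / 100));
        assert(r == (int)((t >> 2) % 100));
    }
}
```

Subtracting 50 from q and r then gives x and y in the intended range. Rounding the constant down instead makes small inputs such as t = 4 land one short, which is why the sketch rounds up.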