I need a function to generate random integers. (assume Java long
type for now, but this will be extended to BigInteger
or BitSet
later.)
The tricky part is there is a parameter P that specifies the (independent) probability of any bit in the result being 1.
If P = 0.5 then we can just use the standard random number generator. Some other values of P are also easy to implement. Here's an incomplete example:
Random random = new Random();
// ...
long nextLong(float p) {
if (p == 0.0f) return 0L;
else if (p == 1.0f) return -1L;
else if (p == 0.5f) return random.nextLong();
else if (p == 0.25f) return nextLong(0.5f) & nextLong(0.5f);
else if (p == 0.75f) return nextLong(0.5f) | nextLong(0.5f);
else if (p == 0.375f) return nextLong(0.5f) & nextLong(0.75f); // etc
else {
// What goes here??
String message = String.format("P=%f not implemented yet!", p);
throw new IllegalArgumentException(message);
}
}
Is there a way to generalise this for any value of P between 0.0 and 1.0?
First a little ugly math that you're already using in your code.
Define x and y are bits with probability of being 1 of X = p(x=1), Y = p(y=1) respectively. Then we have that
p( x & y = 1) = X Y
p( x | y = 1) = 1 - (1-X) (1-Y)
p( x ^ y = 1) = X (1 - Y) + Y (1 - X)
Now if we let Y = 1/2 we get
P( x & y ) = X/2
P( x | y ) = (X+1)/2
Now set the RHS to the probability we want and we have two cases that we can solve for X
X = 2 p // if we use &
X = 2 p - 1 // if we use |
Next we assume we can use this again to obtain X in terms of another variable Z... And then we keep iterating until we've done "enough".
Thats a bit unclear but consider p = 0.375
0.375 * 2 = 0.75 < 1.0 so our first operation is &
0.75 * 2 = 1.5 > 1.0 so our second operation is |
0.5 is something we know so we stop.
Thus we can get a variable with p=0.375 by X1 & (X2 | X3 )
The problem is that for most variables this will not terminate. e.g.
0.333 *2 = 0.666 < 1.0 so our first operation is &
0.666 *2 = 1.333 > 1.0 so our second operation is |
0.333 *2 = 0.666 < 1.0 so our third operation is &
etc...
so p=0.333 can be generated by
X1 & ( X2 | (X3 & (X4 | ( ... ) ) ) )
Now I suspect that taking enough terms in the series will give you enough accuracy, and this can be written as a recursive function. However there might be a better way that that too... I think the order of the operations is related to the binary representation of p, I'm just not sure exactly how... and dont have time to think about it deeper.
Anyway heres some untested C++ code that does this. You should be able to javaify it easily.
uint bitsWithProbability( float p )
{
return bitsWithProbabilityHelper( p, 0.001, 0, 10 );
}
uint bitsWithProbabilityHelper( float p, float tol, int cur_depth, int max_depth )
{
uint X = randbits();
if( cur_depth >= max_depth) return X;
if( p<0.5-tol)
{
return X & bitsWithProbabilityHelper( 2*p, 0.001, cur_depth+1, max_depth );
}
if(p>0.5+tol)
{
return X | bitsWithProbabilityHelper( 2*p-1, 0.001, cur_depth+1, max_depth );
}
return X;
}
Distribute proportional number of bits throughuot the number. Pseudocode:
long generateNumber( double probability ){
int bitCount = 64 * probability;
byte[] data = new byte[64]; // 0-filled
long indexes = getRandomLong();
for 0 to bitCount-1 {
do {
// distribute this bit to some postition with 0.
int index = indexes & 64;
indexes >> 6;
if( indexes == 0 ) indexes = getRandomLong();
} while ( data[index] == 0 );
data[index] = 1;
}
return bytesToLong( data );
}
I hope you get what I mean. Perhaps the byte[]
could be replaced with a long
and bit operations to make it faster.
Here's how I solved it in the end.
This was partly inspired by Ondra Žižka's answer.
The benefit is that it reduces the number of calls to Random.nextLong()
to 8 calls per 64 bits of output.
For comparison, rolling for each individual bit would require 64 calls. Bitwise AND/OR uses between 2 and 32 calls depending on the value of P
Of course calculating binomial probabilities is just as expensive, so those go in another lookup table.
It's a lot of code, but it's paying off in terms of performance.
Update - merged this with the bitwise AND/OR solution. It now uses that method if it guesses it will be more efficient (in terms of calls to Random.next()
.)
Use a random generator that generates a uniform float number r between 0 and 1. If r>p then set the bit to 0, otherwise set it to 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With