I have a stream of (uniform) random bits from which I'd like to generate random integers uniformly in the range [0,n] without wasting bits. (I'm considering bits wasted which are in excess of floor(log_2(n))+1, on the assumption that it's always possible to use no more than that.) E.g., if n = 5, then the algorithm I'm looking for should use no more than three bits. How can this be done?
Let me talk about random integer generating algorithms that are "optimal" in terms of the number of random bits it uses on average. In the rest of this post, we will assume we have a "true" random generator that can produce unbiased and independent random bits.
In 1976, D. E. Knuth and A. C. Yao showed that any algorithm that produces random integers with a given probability, using only random bits, can be represented as a binary tree, where random bits indicate which way to traverse the tree and each leaf (endpoint) corresponds to an outcome. (Knuth and Yao, "The complexity of nonuniform random number generation", in Algorithms and Complexity, 1976.) Knuth and Yao showed that any optimal binary tree algorithm for generating integers in [0, n) uniformly will need at least log2(n) and at most log2(n) + 2 bits on average. (Thus, even an optimal algorithm has a chance of "wasting" bits.) See below for examples of optimal algorithms.
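For example, when n = 3, one optimal tree reads two bits at a time: the bit patterns 00, 01, and 10 label leaves with the outcomes 0, 1, and 2, and the pattern 11 leads to a "rejection" leaf that restarts the walk from the root. A minimal sketch of that tree as code (the `nextBit` parameter here is a hypothetical stand-in for an unbiased random-bit source):

```javascript
// Optimal Knuth–Yao tree for n = 3: read two bits; 00 → 0, 01 → 1,
// 10 → 2, and 11 restarts the walk from the root.  The expected number
// of bits used is 2/(3/4) = 8/3 ≈ 2.67, which lies between the bounds
// log2(3) ≈ 1.585 and log2(3) + 2 ≈ 3.585.
function treeSample(nextBit) {
    while (true) {
        var b0 = nextBit()
        var b1 = nextBit()
        var leaf = b0 * 2 + b1
        if (leaf < 3) { return leaf } // leaf labeled with an outcome
        // leaf 11 is a rejection leaf: walk the tree again
    }
}
```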
However, any optimal integer generator that is also unbiased will, in general, run forever in the worst case, as also shown by Knuth and Yao. Going back to the binary tree, each one of the n outcomes labels leaves in the binary tree so that each integer in [0, n) can occur with probability 1/n. But if 1/n has a non-terminating binary expansion (which will be the case if n is not a power of 2), this binary tree will necessarily either—

- have an infinite depth, or
- include "rejection" leaves at the end of the tree, from which the walk restarts.
And in either case, the algorithm will run forever in the worst case, even if it uses very few random bits on average. (On the other hand, when n is a power of 2, the optimal binary tree will have no rejection leaves and require exactly log2(n) bits before returning an outcome, so that no bits will be "wasted".) The Fast Dice Roller is an example of an algorithm that uses "rejection" events to ensure it's unbiased; see the comment in the code below.
Thus, in general, a random integer generator can be either unbiased or constant-time (or even neither), but not both. And the binary tree concept shows that there is no way in general to "fix" the worst case of an indefinite running time without introducing bias. For instance, modulo reductions (e.g., rand() % n) are equivalent to a binary tree in which rejection leaves are replaced with labeled outcomes — but since there are more possible outcomes than rejection leaves, only some of the outcomes can take the place of the rejection leaves, introducing bias. The same kind of binary tree — and the same kind of bias — results if you stop rejecting after a set number of iterations. (However, this bias may be negligible depending on the application. There are also security aspects to random integer generation, which are too complicated to discuss in this answer.)
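To see the modulo bias concretely, consider reducing a uniform 8-bit value modulo 3. Since 256 = 3 × 85 + 1, one remainder must occur more often than the others:

```javascript
// Count how often each remainder of (8-bit value % 3) occurs over all
// 256 equally likely byte values.  Remainder 0 appears 86 times, while
// remainders 1 and 2 appear 85 times each — a slight but real bias.
var counts = [0, 0, 0]
for (var v = 0; v < 256; v++) {
    counts[v % 3]++
}
// counts is now [86, 85, 85]
```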
There are many examples of optimal algorithms in the sense given earlier. One of them is the Fast Dice Roller by J. Lumbroso (2013) (implemented below), and perhaps other examples are the algorithm given in one of the other answers here and the algorithm given in the Math Forum in 2004. On the other hand, all the algorithms surveyed by M. O'Neill are not optimal, since they rely on generating blocks of random bits at a time. See also my note on integer generating algorithms.
The following is a JavaScript implementation of the Fast Dice Roller. Note that it uses rejection events and a loop to ensure it's unbiased. nextBit() is a method that produces an independent unbiased random bit (e.g., Math.random()<0.5 ? 1 : 0, which isn't necessarily efficient in terms of the random bits ultimately relied on in JavaScript).
function randomInt(minInclusive, maxExclusive) {
    var maxInclusive = (maxExclusive - minInclusive) - 1
    var x = 1
    var y = 0
    while (true) {
        x = x * 2
        var randomBit = nextBit()
        y = y * 2 + randomBit
        if (x > maxInclusive) {
            if (y <= maxInclusive) { return y + minInclusive }
            // Rejection
            x = x - maxInclusive - 1
            y = y - maxInclusive - 1
        }
    }
}
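As a quick usage sketch, here is the same function with a hypothetical nextBit() backed by Math.random() (as described above), rolling a six-sided die and checking that every result lands in [0, 6):

```javascript
// Hypothetical nextBit() backed by Math.random(), as described above.
function nextBit() { return Math.random() < 0.5 ? 1 : 0 }

// Fast Dice Roller, as defined above (repeated here so this sketch is
// self-contained).
function randomInt(minInclusive, maxExclusive) {
    var maxInclusive = (maxExclusive - minInclusive) - 1
    var x = 1
    var y = 0
    while (true) {
        x = x * 2
        y = y * 2 + nextBit()
        if (x > maxInclusive) {
            if (y <= maxInclusive) { return y + minInclusive }
            // Rejection: discard this candidate and keep the leftover
            // randomness for the next round.
            x = x - maxInclusive - 1
            y = y - maxInclusive - 1
        }
    }
}

var allInRange = true
for (var i = 0; i < 1000; i++) {
    var roll = randomInt(0, 6)
    if (roll < 0 || roll >= 6) { allInRange = false }
}
// allInRange remains true
```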
The following version returns a BigInt, an arbitrary-precision integer supported in recent versions of JavaScript:
function randomInt(minInclusive, maxExclusive) {
    minInclusive = BigInt(minInclusive)
    maxExclusive = BigInt(maxExclusive)
    var maxInclusive = (maxExclusive - minInclusive) - BigInt(1)
    var x = BigInt(1)
    var y = BigInt(0)
    while (true) {
        x = x * BigInt(2)
        var randomBit = BigInt(Math.random() < 0.5 ? 1 : 0)
        y = y * BigInt(2) + randomBit
        if (x > maxInclusive) {
            if (y <= maxInclusive) { return y + minInclusive }
            // Rejection
            x = x - maxInclusive - BigInt(1)
            y = y - maxInclusive - BigInt(1)
        }
    }
}
Recall that "optimal" integer generators, such as the Fast Dice Roller above, use on average at least log2(n) bits (the lower bound) and come within 2 bits of this lower bound on average. There are various techniques that can be used to bring an algorithm (even a less-than-optimal one) closer to this theoretical lower bound, including batching and randomness extraction; these are discussed in my note on integer generating algorithms, linked earlier.
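As a rough illustration of the batching idea: instead of generating k die rolls one call at a time, generate a single uniform integer in [0, n^k) and split it into k base-n digits, so the rejection overhead is paid once per batch rather than once per roll. A hedged sketch (the `uniformInt` parameter is a hypothetical stand-in for any uniform generator, such as randomInt above):

```javascript
// Batching sketch: draw one uniform integer in [0, n^k) and split it
// into k base-n digits, each of which is then uniform in [0, n).
function batchRolls(uniformInt, n, k) {
    // uniformInt(bound) must return a uniform integer in [0, bound)
    var x = uniformInt(Math.pow(n, k))
    var rolls = []
    for (var i = 0; i < k; i++) {
        rolls.push(x % n)
        x = Math.floor(x / n)
    }
    return rolls
}
```

(Note that Math.pow overflows for large n^k; a production version would use BigInt as in the second implementation above.)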
This is equivalent to finding a one-to-one correspondence between two sets of different (finite) cardinality. It is impossible.