Is there a way to convert uniformly distributed random numbers of one range to uniformly distributed random numbers of another range frugally?
Let me explain what I mean by "frugally".
The typical approach to generate a random number within a given range (e.g. r ∈ [0..10)) is to take some fixed number of random bits, say 31, which yields a non-negative random number less than 2147483648. Then make sure that the value is less than 2147483640 (the largest multiple of 10 not exceeding 2^31; values above it would make the distribution uneven). If the value is greater than or equal to 2147483640, throw it away and try again (get the next 31 random bits, and so on). If the value is less than 2147483640, just return the remainder of division by 10. This approach consumes at least 31 bits per decimal digit. Since the theoretical limit is log2(10) = 3.321928... bits, it is quite wasteful.
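For concreteness, here is a minimal sketch of that scheme, assuming a next31Bits() helper (not shown) that returns a uniform non-negative number below 2^31:

public int nextDigit() {
    int result;
    do {
        result = next31Bits();      // uniform in [0, 2147483648)
    } while (result >= 2147483640); // largest multiple of 10 below 2^31
    return result % 10;             // uniform in [0, 10)
}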
We can improve on this by using 4 bits instead of 31. Since each 4-bit value is accepted with probability 10/16, we need 16/10 = 1.6 draws on average, so this consumes 4 × 1.6 = 6.4 bits per decimal digit. This is more frugal, but still far from the ideal.
public int nextDit() {
    int result;
    do {
        result = next4Bits();   // uniform in [0, 16)
    } while (result >= 10);     // reject 10..15 and redraw
    return result;              // uniform in [0, 10)
}
We can try to generate 3 decimal digits at once. Since 1024 is quite close to 1000, the probability that the raw source random number is rejected is lower than in the previous case. Once we have generated 3 decimal digits, we return 1 digit and reserve the remaining 2 digits.
Something like the code below:
private int _decDigits = 0;
private int _decCount = 0;

public int nextDit() {
    if (_decCount > 0) {
        // take digits from the reserve
        int result = _decDigits % 10;
        _decDigits /= 10;
        _decCount -= 1;
        return result;
    } else {
        int result;
        do {
            result = next10Bits();  // uniform in [0, 1024)
        } while (result >= 1000);   // reject 1000..1023 and redraw
        // reserve 2 decimal digits
        _decCount = 2;
        _decDigits = result % 100;
        result /= 100;
        return result;
    }
}
This approach is much more frugal: it consumes 10 × 1.024 / 3 = 3.41(3) bits per decimal digit (each 10-bit draw is accepted with probability 1000/1024, so on average 1.024 draws, i.e. 10.24 bits, yield 3 digits).
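For reference, a small helper (not part of the original code) that computes the expected cost of this block-rejection scheme; it reproduces both figures above, 6.4 for (4 bits, 1 digit) and 3.41(3) for (10 bits, 3 digits):

// expected bits per decimal digit when drawing `bits` random bits,
// rejecting values >= 10^digits, and emitting `digits` digits on success
static double bitsPerDigit(int bits, int digits) {
    double outcomes = Math.pow(2, bits);        // number of raw values
    double accepted = Math.pow(10, digits);     // values we keep
    double expectedDraws = outcomes / accepted; // 1 / acceptance probability
    return bits * expectedDraws / digits;
}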
We can go even further if we reuse the numbers that we previously threw away. The random number r ∈ [0, 1024) falls into one of 3 ranges: [0, 1000), [1000, 1020), [1020, 1024).
If it falls into [0, 1000), we do as before: reserve 2 decimal digits (in the decimal digit reserve) and return 1 decimal digit.
If it falls into [1000, 1020), we subtract 1000, converting it to the range [0, 20). Then we get 1 bit (the quotient of division by 10) and 1 decimal digit (the remainder of division by 10). We put the bit into the binary digit reserve and return the decimal digit.
If it falls into [1020, 1024), we subtract 1020, converting it to the range [0, 4). Here we get just 2 bits, which we put into the binary digit reserve.
// decimal digit reserve
private int _decDigits = 0;
private int _decCount = 0;
// binary digit reserve
private int _binDigits = 0;
private int _binCount = 0;

private int nextBits(int bits, int n) {
    // append n fresh random bits to the given value
    for (int i = 0; i < n; i += 1) {
        bits = (bits << 1) + _bitRandomDevice.nextBit();
    }
    return bits;
}
private int next10Bits() {
    // take bits from the binary reserve first, then from _bitRandomDevice
    int result;
    if (_binCount >= 10) {
        result = _binDigits >> (_binCount - 10);
        // keep only the low (_binCount - 10) bits; note the parentheses,
        // since `-` binds tighter than `<<` in Java
        _binDigits = _binDigits & ((1 << (_binCount - 10)) - 1);
        _binCount -= 10;
    } else {
        result = nextBits(_binDigits, 10 - _binCount);
        _binCount = 0;
        _binDigits = 0;
    }
    return result;
}
public int nextDit() {
    if (_decCount > 0) {
        // take numbers from the decimal reserve
        int result = _decDigits % 10;
        _decDigits /= 10;
        _decCount -= 1;
        return result;
    } else {
        int result;
        while (true) {
            result = next10Bits();
            if (result < 1000) {
                assert result >= 0 && result < 1000;
                // reserve 2 decimal digits
                _decCount = 2;
                _decDigits = result % 100;
                result /= 100;
                // return 1 decimal digit
                return result;
            } else if (result < 1020) {
                result -= 1000;
                assert result >= 0 && result < 20;
                // reserve 1 binary digit
                _binCount += 1;
                _binDigits = (_binDigits << 1) + (result / 10);
                // return 1 decimal digit
                return result % 10;
            } else {
                result -= 1020;
                assert result >= 0 && result < 4;
                // reserve 2 binary digits
                _binCount += 2;
                _binDigits = (_binDigits << 2) + result;
            }
        }
    }
}
This approach consumes about 3.38 bits per decimal digit. It is the most frugal approach I could find, but it still wastes/loses some information from the source of randomness.
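One way to check that figure empirically is to wrap the generator in a bit-counting source. The following harness is a sketch under assumptions: the fields and methods above are collected in a hypothetical class FrugalDitGenerator that draws from the given device, and _bitRandomDevice is the BitCountingDevice below, backed by java.util.Random:

import java.util.Random;

class BitCountingDevice {
    private final Random rng = new Random();
    long bitsServed = 0;            // total raw bits handed out
    int nextBit() {
        bitsServed += 1;
        return rng.nextInt(2);      // one unbiased random bit
    }
}

// usage: draw many digits and report the average cost per digit
public static void main(String[] args) {
    BitCountingDevice device = new BitCountingDevice();
    FrugalDitGenerator gen = new FrugalDitGenerator(device);
    int n = 1_000_000;
    for (int i = 0; i < n; i += 1) {
        gen.nextDit();
    }
    System.out.println((double) device.bitsServed / n); // about 3.38
}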
Thus, my question is: is there any universal approach/algorithm that converts uniformly distributed random numbers of one arbitrary range [0, s) (later called source numbers) to uniformly distributed random numbers of another arbitrary range [0, t) (later called target numbers), consuming only log_s(t) + C source numbers per target number, where C is some constant? If there is no such approach, why not? What prevents reaching the ideal limit?
The purpose of being frugal is to reduce the number of calls to the RNG. This is especially worthwhile when working with a true RNG, which often has limited throughput.
As for "frugality optimizations", they are based on following assumptions:
The Uniform Random Number block generates uniformly distributed random numbers over an interval that you specify. To generate normally distributed random numbers, use the Random Number block. Both blocks use the Normal (Gaussian) random number generator ( 'v4' : legacy MATLAB® 4.0 generator of the rng function).
It's just a random number where each possible number is just as likely as any other possible number. A fair die is a uniform random number generator for numbers between 1 and 6 inclusive.
Abstract. A frequently used technique for generating non-uniform random numbers is to combine uniform random numbers. For example, the sum of two uniform random numbers yields a triangular distribution. The difference yields another which will be statistically independent.
To generate random numbers from the Uniform distribution we will use random.uniform () method of random module. In uniform distribution samples are uniformly distributed over the half-open interval [low, high) it includes low but excludes high interval. Attention geek!
The inversion method relies on the principle that continuous cumulative distribution functions (cdfs) range uniformly over the open interval (0,1). If is a uniform random number on (0,1), then generates a random number from any continuous distribution with the specified cdf F. Step 2. Generate random numbers from the Weibull distribution.
This is useful for distributions when it is possible to compute the inverse cumulative distribution function, but there is no support for sampling from the distribution directly. Step 1. Generate random numbers from the standard uniform distribution.
Let us show that the random variable Y, where Y = 28 X − 11, is indeed uniformly distributed in the interval [ − 11, 17]. Note that Y is just a scaling of X followed by a shift. That should preserve uniform distribution.
Your goal is ultimately to roll a k-sided die given only a p-sided die, without wasting randomness.
In this sense, by Lemma 3 in "Simulating a dice with a dice" by B. Kloeckner, this waste is inevitable unless "every prime number dividing k also divides p". Thus, for example, if p is a power of 2 (and any block of random bits is the same as rolling a die with a power of 2 number of faces) and k has prime factors other than 2, the best you can do is get arbitrarily close to no waste of randomness.
Also, besides batching of bits to reduce "bit waste" (see also the Math Forum), there is also the technique of randomness extraction, discussed in Devroye and Gravel 2015-2020 and in my Note on Randomness Extraction.
See also the question: How to generate a random integer in the range [0,n] from a stream of random bits without wasting bits?, especially my answer there.
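For a flavor of what randomness extraction looks like, here is a sketch of the classic von Neumann extractor, a much simpler relative of the extractors discussed in those references: it reads bits from a possibly biased source in pairs, outputs an unbiased bit for 01 or 10, and discards 00 and 11:

// von Neumann extractor: biasedBit must return independent bits with a
// fixed (possibly unknown) bias
static int nextUnbiasedBit(java.util.function.IntSupplier biasedBit) {
    while (true) {
        int a = biasedBit.getAsInt();
        int b = biasedBit.getAsInt();
        if (a != b) {
            return a;  // P(01) == P(10) for independent flips, so this bit is fair
        }
        // 00 or 11: the pair carries the bias, discard it and try again
    }
}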
Keep adding more digits. Here's some Python to compute expected yields (this is slightly worse for a particular value of n than your approach because it doesn't save leftover bits, but it's good enough to make my point):
import math

# expected number of base-b digits extracted from a uniform integer in
# [0, n): at each step the top n % b values are rejected, one digit is
# emitted, and n shrinks to n // b
def expected_digits(n, b):
    total = 0
    p = 1
    while n >= b:
        p *= 1 - (n % b) / n  # probability of not being rejected at this step
        total += p
        n //= b
    return total

def expected_yield(k):
    # decimal digits per bit when starting from k random bits
    return expected_digits(2 ** k, 10) / k

print(expected_yield(10))
print(expected_yield(30))
print(expected_yield(100000))
print(math.log10(2))
The output is
0.294921875
0.2952809327592452
0.301018918814536
0.3010299956639812
and as you can see, 100000 binary digits (second-to-last line) gets quite close to the Shannon limit (last line).
In theoretical terms, we're applying an arithmetic decoder where all output numbers have equal probability to an infinite stream of bits (interpreted as a random number between 0 and 1). The asymptotic efficiency approaches perfection, but the more samples you take, the heavier the arithmetic gets. That tends to be the trade-off.
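To make that concrete, here is a sketch (my own, not from the references above) of such an arithmetic decoder: it maintains the interval [a/2^k, (a+w)/2^k) of reals consistent with the bits read so far, reads one more bit whenever the next decimal digit is still ambiguous, and renormalizes once the digit is determined. BigInteger is used because the numbers grow as more samples are taken, which is exactly the heavier arithmetic mentioned above:

import java.math.BigInteger;
import java.util.Random;

class ArithmeticDigits {
    private final Random bits = new Random();  // stands in for the bit stream
    private BigInteger a = BigInteger.ZERO;    // interval low, numerator
    private BigInteger w = BigInteger.ONE;     // interval width, numerator
    private BigInteger den = BigInteger.ONE;   // denominator, 2^k

    public int nextDigit() {
        while (true) {
            // candidate digit: d = floor(10 * a / 2^k)
            BigInteger d = a.multiply(BigInteger.TEN).divide(den);
            // the digit is determined once the whole interval fits inside
            // its decimal bucket: 10 * (a + w) <= (d + 1) * 2^k
            if (a.add(w).multiply(BigInteger.TEN)
                 .compareTo(d.add(BigInteger.ONE).multiply(den)) <= 0) {
                // renormalize: r <- 10 * r - d maps the bucket back to [0, 1)
                a = a.multiply(BigInteger.TEN).subtract(d.multiply(den));
                w = w.multiply(BigInteger.TEN);
                return d.intValueExact();
            }
            // otherwise consume one more bit: halve the interval
            den = den.shiftLeft(1);
            a = a.shiftLeft(1).add(bits.nextBoolean() ? w : BigInteger.ZERO);
        }
    }
}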