Why does this code generate uniformly distributed numbers? I have some difficulty understanding it. Could someone explain? Thanks.
    int RandomUniform(int n) {
        int top = ((((RAND_MAX - n) + 1) / n) * n - 1) + n;
        int r;
        do {
            r = rand();
        } while (r > top);
        return (r % n);
    }
Update: I do understand why rand() % n doesn't give you a uniformly distributed sequence. My question is why top is computed as
    top = ((((RAND_MAX - n) + 1) / n) * n - 1) + n;
What's the concern here? I think a simple top = RAND_MAX / n * n would do.
The function assumes that rand() is uniformly distributed; whether or not that is a valid assumption depends on the implementation of rand().
Given a uniform rand(), we can get a random number in the range [0,n) by calculating rand() % n. However, in general, this won't be quite uniform. For example, suppose n is 3 and RAND_MAX is 7:
    rand()      0 1 2 3 4 5 6 7
    rand() % n  0 1 2 0 1 2 0 1
We can see that 0 and 1 come up with a probability of 3/8, while 2 only comes up with a probability of 2/8: the distribution is not uniform.
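
To check the table above mechanically, here is a minimal sketch that tallies v % n for every value a toy generator could return. The function name count_residues and the parameter rand_max (standing in for RAND_MAX) are made up for this illustration, and the loop is only meant for small toy ranges.

```c
#include <assert.h>

/* Hypothetical helper: count how often each residue v % n occurs
   for v in 0..rand_max. counts must have room for n entries.
   Intended for small toy values of rand_max only. */
void count_residues(int rand_max, int n, int *counts) {
    assert(n > 0 && rand_max >= 0);
    for (int i = 0; i < n; i++) {
        counts[i] = 0;
    }
    for (int v = 0; v <= rand_max; v++) {
        counts[v % n]++;
    }
}
```

Calling count_residues(7, 3, counts) fills counts with 3, 3 and 2, which matches the 3/8, 3/8 and 2/8 probabilities above.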
Your code discards any value of rand() greater than or equal to the largest multiple of n that does not exceed RAND_MAX + 1 (here, 6). Now each value has an equal probability:
    rand()      0 1 2 3 4 5 6 7
    rand() % n  0 1 2 0 1 2 X X
So 0, 1 and 2 all come up with a probability of 1/3, as long as we are not so unlucky that the loop never terminates.
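
The same toy example can be checked with the cut-off applied. This is a sketch under stated assumptions: rejection_top and count_accepted_residues are hypothetical names, rand_max stands in for RAND_MAX, and the code enumerates every possible generator output instead of calling rand(), so it only suits small ranges.

```c
#include <assert.h>

/* Inclusive cut-off from the question, with rand_max standing in
   for RAND_MAX. Assumes 0 < n <= rand_max. */
int rejection_top(int rand_max, int n) {
    assert(n > 0 && n <= rand_max);
    return ((((rand_max - n) + 1) / n) * n - 1) + n;
}

/* Tally v % n over the values 0..rand_max that survive the
   r > top test. counts must have room for n entries. */
void count_accepted_residues(int rand_max, int n, int *counts) {
    int top = rejection_top(rand_max, n);
    for (int i = 0; i < n; i++) {
        counts[i] = 0;
    }
    for (int v = 0; v <= rand_max; v++) {
        if (v > top) {
            continue;  /* the do/while loop would draw again here */
        }
        counts[v % n]++;
    }
}
```

For rand_max 7 and n 3 the cut-off is 5, and each of the three residues is counted exactly twice.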
Regarding your update:
I think a simple top = RAND_MAX / n * n would do.
If RAND_MAX were an exclusive bound (one more than the actual maximum), then that would be correct. Since it's an inclusive bound, we need to add one to get the exclusive bound; and since the loop compares r with > against an inclusive bound, we then subtract one again after the calculation:
    int top = ((RAND_MAX + 1) / n) * n - 1;
However, if RAND_MAX were equal to INT_MAX, then RAND_MAX + 1 would overflow; to avoid that, subtract n at the beginning of the calculation, and add it again at the end:
    int top = (((RAND_MAX - n) + 1) / n) * n - 1 + n;
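
One way to convince yourself that the two formulas agree is to evaluate the first one in a wider type, where RAND_MAX + 1 cannot overflow, and compare. The sketch below assumes long long is wider than int (true on common platforms, not guaranteed everywhere) and that 0 < n <= rand_max; the names top_wide, top_safe and top_proposed are invented for this illustration, with rand_max standing in for RAND_MAX.

```c
#include <assert.h>
#include <limits.h>

/* Reference value: ((RAND_MAX + 1) / n) * n - 1, computed in long long
   so that rand_max + 1 cannot overflow even when rand_max == INT_MAX. */
long long top_wide(int rand_max, int n) {
    assert(n > 0 && n <= rand_max);
    return ((long long)rand_max + 1) / n * n - 1;
}

/* Overflow-safe form from the question: subtract n first, add it back last. */
int top_safe(int rand_max, int n) {
    assert(n > 0 && n <= rand_max);
    return (((rand_max - n) + 1) / n) * n - 1 + n;
}

/* The simpler formula suggested in the question's update. */
int top_proposed(int rand_max, int n) {
    assert(n > 0);
    return rand_max / n * n;
}
```

With rand_max 7 and n 3, top_safe and top_wide both give 5, while top_proposed gives 6. Using 6 with the r > top test would accept r = 6, so 0 would again come up more often than 1 or 2.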
The underlying problem is this: suppose you have a random number generator my_rand() that produces values from 0 to 6, inclusive, and you want to generate values from 0 to 5, inclusive; if you run your generator and return my_rand() % 6, you won't get a uniform distribution. When my_rand() returns 0, you get 0; when it returns 1, you get 1, etc. until my_rand() returns 6; in that case my_rand() % 6 is 0. So overall, my_rand() % 6 will return 0 twice as often as any other value. The way to fix this is to not use values greater than 5, that is, instead of returning my_rand() % 6 directly you write a loop and discard values from my_rand() that are too large. That's essentially what the code in the question is doing. I haven't traced it through, but the usual implementation is to compute the largest multiple of n that is less than or equal to RAND_MAX, and whenever rand() returns a value that's greater than or equal to that multiple, go back and get a new value.
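
As a rough sketch of that loop for the 0-to-6 example: my_rand() here is a hypothetical stand-in that simply cycles through 0, 1, ..., 6 so the result can be checked by hand (a real generator would of course be random), and uniform_0_to_5 is a name made up for this illustration.

```c
#include <assert.h>

/* Hypothetical stand-in generator: cycles 0,1,...,6 deterministically,
   so every value in 0..6 appears equally often. */
static int my_rand_state = 0;

int my_rand(void) {
    int v = my_rand_state;
    my_rand_state = (my_rand_state + 1) % 7;
    return v;
}

/* Return a value in 0..5 by drawing again whenever my_rand() yields 6. */
int uniform_0_to_5(void) {
    int r;
    do {
        r = my_rand();
    } while (r > 5);   /* 6 is the only value that is too large */
    return r % 6;      /* r is already in 0..5, so this leaves it unchanged */
}
```

Over two full cycles of the stand-in generator (twelve calls to uniform_0_to_5), each of 0 through 5 is returned exactly twice, whereas plain my_rand() % 6 would return 0 twice per cycle.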