uniformly distributed random number generation

Tags:

c++

math

Why does this code generate uniformly distributed numbers? I am having some difficulty understanding it. Could someone explain? Thanks.

int RandomUniform(int n) {  
  int top = ((((RAND_MAX - n) + 1) / n) * n - 1) + n;  
  int r;  
  do {  
    r = rand();  
  } while (r > top);  
  return (r % n);  
}

Update: I do understand why rand() % n doesn't give you a uniformly distributed sequence. My question is why top is computed as

top = ((((RAND_MAX - n) + 1) / n) * n - 1) + n;

What's the concern here? I think a simple top = RAND_MAX / n * n would do.

asked Feb 04 '13 15:02

JASON

2 Answers

The function assumes that rand() is uniformly distributed; whether or not that is a valid assumption depends on the implementation of rand().

Given a uniform rand(), we can get a random number in the range [0,n) by calculating rand()%n. However, in general, this won't be quite uniform. For example, suppose n is 3 and RAND_MAX is 7:

rand()      0 1 2 3 4 5 6 7
rand() % n  0 1 2 0 1 2 0 1

We can see that 0 and 1 come up with a probability of 3/8, while 2 only comes up with a probability of 2/8: the distribution is not uniform.
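The table above can be reproduced by brute force. The following is a minimal sketch, not part of the original code: ResidueCounts and max_value are hypothetical names, with max_value standing in for a small toy RAND_MAX such as 7.

```cpp
#include <cassert>
#include <vector>

// Counts how often each residue 0..n-1 appears when every value in
// [0, max_value] is reduced modulo n. max_value is a stand-in for a
// small toy RAND_MAX; do not pass INT_MAX, the loop counter would overflow.
std::vector<int> ResidueCounts(int max_value, int n) {
  assert(n > 0 && max_value >= 0 && max_value < 1000000);
  std::vector<int> counts(n, 0);
  for (int r = 0; r <= max_value; ++r) {
    ++counts[r % n];
  }
  return counts;
}
```

For max_value = 7 and n = 3 the counts are 3, 3 and 2, matching the probabilities 3/8, 3/8 and 2/8.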

Your code discards any value of rand() greater than or equal to the largest multiple of n that does not exceed RAND_MAX + 1 (here 6, so 6 and 7 are rejected). Now each value has an equal probability:

rand()      0 1 2 3 4 5 6 7
rand() % n  0 1 2 0 1 2 X X

So 0, 1 and 2 each come up with a probability of 1/3, as long as we are not so unlucky that the loop never terminates.
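As a rough check of the rejection step, the sketch below enumerates every possible output of a toy generator and skips the values that the do/while loop would redraw. KeptResidueCounts and max_value are hypothetical names introduced for illustration; the top formula is the one from the question.

```cpp
#include <cassert>
#include <vector>

// Counts residues over [0, max_value] after dropping every value above
// top, i.e. the values for which the original loop would call rand() again.
// Intended for small toy ranges only.
std::vector<int> KeptResidueCounts(int max_value, int n) {
  assert(n > 0 && n - 1 <= max_value && max_value < 1000000);
  const int top = ((max_value - n + 1) / n) * n - 1 + n;
  std::vector<int> counts(n, 0);
  for (int r = 0; r <= max_value; ++r) {
    if (r > top) continue;  // rejected: the real code would draw again
    ++counts[r % n];
  }
  return counts;
}
```

With max_value = 7 and n = 3, top is 5, so 6 and 7 are dropped and each residue is counted exactly twice.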

Regarding your update:

I think a simple top = RAND_MAX / n * n would do.

If RAND_MAX were an exclusive bound (one more than the actual maximum), then that would be correct. Since it's an inclusive bound, we need to add one to get the exclusive bound. And since the loop compares with > against an inclusive bound, we subtract one again after the calculation:

int top = ((RAND_MAX + 1) / n) * n - 1;

However, if RAND_MAX were equal to INT_MAX, then the calculation would overflow; to avoid that, subtract n at the beginning of the calculation, and add it again at the end:

int top = (((RAND_MAX - n) + 1) / n) * n - 1 + n;
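To see that the two forms agree, one option is to evaluate the direct formula in a wider type and compare. This is a sketch under the assumption that long long is wider than int; TopWide and TopSafe are hypothetical names, and max_value stands in for RAND_MAX.

```cpp
#include <cassert>
#include <climits>

// Direct formula, computed in long long so that max_value + 1 cannot
// overflow (assumes long long is wider than int).
long long TopWide(int max_value, int n) {
  assert(n > 0);
  return ((static_cast<long long>(max_value) + 1) / n) * n - 1;
}

// Rearranged form from the question. With 0 < n <= max_value + 1, every
// intermediate value stays within [-1, max_value], so int is enough.
int TopSafe(int max_value, int n) {
  assert(n > 0 && n - 1 <= max_value);
  return ((max_value - n + 1) / n) * n - 1 + n;
}
```

For the toy case max_value = 7, n = 3 both give 5, whereas the proposed RAND_MAX / n * n gives 6; comparing with r > 6 would keep the value 6 and 0 would again appear three times. The two functions also agree at max_value = INT_MAX, where the direct formula in plain int would overflow.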

answered Sep 20 '22 01:09

Mike Seymour

The underlying problem is this: suppose you have a random number generator my_rand() that produces values from 0 to 6, inclusive, and you want to generate values from 0 to 5, inclusive. If you run your generator and return my_rand() % 6, you won't get a uniform distribution. When my_rand() returns 0, you get 0; when it returns 1, you get 1, and so on until my_rand() returns 6; in that case my_rand() % 6 is 0. So overall, my_rand() % 6 will return 0 twice as often as any other value.

The way to fix this is to not use values greater than 5. That is, instead of returning my_rand() % 6 directly, you write a loop and discard values from my_rand() that are too large. That's essentially what the code in the question is doing. I haven't traced it through, but the usual implementation is to compute the largest multiple of n that is less than or equal to RAND_MAX, and whenever rand() returns a value that's greater than that multiple, go back and get a new value.
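The discard loop described here might look like the sketch below. DrawAtMost is a hypothetical helper, and the generator is passed in as a callable so that a stand-in for my_rand() can be supplied; it is assumed to be uniform over its range and to eventually return an acceptable value.

```cpp
#include <functional>

// Keeps drawing from source until the value is at most limit, then
// returns it. If source is uniform on [0, max], the result is uniform
// on [0, limit], because every accepted value is equally likely.
int DrawAtMost(const std::function<int()>& source, int limit) {
  int r;
  do {
    r = source();
  } while (r > limit);
  return r;
}
```

For the 0 to 6 generator above, DrawAtMost(my_rand, 5) would simply skip every 6 and return the next value in 0 to 5.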

answered Sep 22 '22 01:09

Pete Becker
