Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Will the difference between RAND_MAX and UINT_MAX vary?

My homework involves making random integers between 0 and 2^30. Now, in the past we've learned that rand() only returns integers up to RAND_MAX, that this is less than UINT_MAX, and that we can use bit shifting to fill up that UINT_MAX capacity. From some reading I've done (here, on SO), I realise that this may not be a very good idea if the distribution of these numbers matters to me. Having said that, my professor has specified this method.

My question is, by how much to bit shift? Will the difference between RAND_MAX and UINT_MAX always be such that there is a safe constant by which to bit shift? Or is there some initial probing that needs to be done to determine the number by which to bit shift? Should I just keep bit shifting by a little bit and checking against UINT_MAX?

The reason I ask is that, UINT_MAX is defined to be at least a certain number (65535), but on my machine UINT_MAX is much bigger (4294967295). This got me worried that I might finish my homework over the weekend, arrive at school, and find everything not working well enough to submit.

Thanks!

References:

I've read a couple of similar questions but couldn't garner an answer from them.

Is the value of RAND_MAX always (2^n)-1?

generating a random number within range 0 to n where n can be > RAND_MAX

Actually, the second question above makes me question the worth of doing this at all?

like image 685
Ziggy Avatar asked Nov 27 '11 04:11

Ziggy


People also ask

What is rand MAX in C?

Macro: int RAND_MAX. The value of this macro is an integer constant representing the largest value the rand function can return. In the GNU C Library, it is 2147483647 , which is the largest signed integer representable in 32 bits. In other libraries, it may be as low as 32767 .

How rand works in C?

The rand function, declared in stdlib. h, returns a random integer in the range 0 to RAND_MAX (inclusive) every time you call it. On machines using the GNU C library RAND_MAX is equal to INT_MAX or 231-1, but it may be as small as 32767.


2 Answers

Your question centers around whether RAND_MAX and UINT_MAX have a bit shift between them. This reduces to the question of whether UINT_MAX and RAND_MAX are of the form 2^k - 1. UINT_MAX almost assuredly would be on any binary number system-based computer. If sizeof(int)=32 bits then k=32, if sizeof(int)=64 bit then k=64, etc. Now we can move to consider RAND_MAX. In most implementations the answer is that RAND_MAX will almost always be of the form 2^k - 1. Why? We need to consider how most implementations of rand() actually work.

  1. rand() typically uses a linear congruential generator (see http://en.wikipedia.org/wiki/Linear_congruential_generator or Knuth "The Art of Computer Programmer Part 2 : Semi-Numerical Algorithms"). Basically a random number is a sequence with a seed

    x(k+1) = ( a x(k) + c ) % m

(i.e. the C library stores the last iterated x(k) and when you call rand() it returns x(k+1))

  1. To be of good quality, the parameters of the generator (a, c, and m) have to be chosen carefully. Quality typically entails how many times before the sequence repeats itself, etc. One tension in the choice of these parameters is to make m as close to UINT_MAX as possible so as to not avoid wasting potential random bits. If you study the generators, typically the right choice is to make m somewhat smaller than UINT_MAX. You also need to make m a prime.

  2. Typically you want rand() to be as fast as possible so you want these operations to be cheap. The cheapest mod to compute is one of the form foo % (2^k - 1) because it can be implemented as foo & (1<<k-1). For a special choice of k you will get a Mersenne prime.

For example, A common choice is k=31 which yields the prime 2^31-1 = 2147483647. This is the typical choice for 32-bit integers where UINT_MAX=2^32-1 = 4294967295. For 64-bit numbers one has UINT_MAX=2^64-1=18446744073709551615 and a choice for RAND_MAX would be 2^61-1 = 2305843009213693951.

So in summary, to answer your question: In most implementations you can assume there is a simple bit shift, however, there is no real guarantee. At the very least you should do a run-time test at your program init. If you are using C++ a better practice would be to use a static_assert to detect at compile time whether your assumptions are true and fail to compile if they are not. Boost has such a static assert, as does the recently approved standard C++11... i.e. one can do (though it may take some work to write a static version of is_power_of_two_minus_one):

unsigned int myrand()
{
        static_assert(sizeof(int)==4,"sizeof(unsigned int) != 4");
        static_assert(is_power_of_two_minus_one(RAND_MAX),"RAND_MAX not a power of two minus one");
        static_assert(is_power_of_two_minus_one(UINT_MAX),"UINT_MAX not power of two minus one");
        unsigned int raw_rand=rand();
        // do your bit shift to adjust raw_rand
        return raw_rand;
}
like image 87
aselle Avatar answered Oct 21 '22 03:10

aselle


Actually, the second question above makes me question the worth of doing this at all?

If your professor told you to do it this way, you should do it this way. It's not crypto-strength certainly, but for the purposes of a homework assignment it will be fine.

For RAND_MAX being 2^n - 1, I'd generally assume so. The way computers generate random numbers is a set number of bits at a time, so if the max weren't 2^n - 1, either not all numbers in the range could be returned or entropy would be wasted.

As for the difference being the same on all systems, I'd strongly advise against assuming that. In your code, figure out how many bits are in each and dynamically figure out how to shift.

And can't you ssh in to the school server you'll be (eventually) running this on?

like image 25
Kevin Avatar answered Oct 21 '22 02:10

Kevin