Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How big can the argument to Perl's rand be?

Tags:

random

perl

rand(n) returns a number between 0 and n. Will rand work as expected, with regard to "randomness", for all arguments up to the integer limit on my platform?

like image 484
Tim Avatar asked May 13 '11 14:05

Tim


2 Answers

This is going to depend on your randbits value:

rand calls your system's random number generator (or whichever one was compiled into your copy of Perl). For this discussion, I'll call that generator RAND to distinguish it from rand, Perl's function. RAND produces an integer from 0 to 2**randbits - 1, inclusive, where randbits is a small integer. To see what it is in your perl, use the command 'perl -V:randbits'. Common values are 15, 16, or 31.

When you call rand with an argument arg, perl takes that value as an integer and calculates this value.

                        arg * RAND
          rand(arg) = ---------------
                        2**randbits

This value will always fall in the range required.

          0  <=  rand(arg)  < arg

But as arg becomes large in comparison to 2**randbits, things become problematic. Let's imagine a machine where randbits = 15, so RAND ranges from 0..32767. That is, whenever we call RAND, we get one of 32768 possible values. Therefore, when we call rand(arg), we get one of 32768 possible values.

like image 183
onteria_ Avatar answered Nov 14 '22 22:11

onteria_


It depends on the number of bits used by your system's (pseudo)random number generator. You can find this value via

perl -V:randbits

or within a program via

use Config;
my $randbits = $Config{randbits};

rand can generate 2^randbits distinct random numbers. While you can generate numbers larger than 2^randbits, you can't generate all of the integer values in the range [0, N) when N > 2^randbits.

Values of N which aren't a power of two can also be problematic, as the distribution of (integer truncated) random values won't quite be flat. Some values will be slightly over-represented, others slightly under-represented.

It's worth noting that randbits is a paltry 15 on Windows. This means you can only get 32768 (2**15) distinct values. You can improve the situation by making multiple calls to rand and combining the values:

use Config;
use constant RANDBITS => $Config{randbits};
use constant RAND_MAX => 2**RANDBITS;

sub double_rand {
    my $max = shift || 1;
    my $iv  =
          int rand(RAND_MAX) << RANDBITS
        | int rand(RAND_MAX);
    return $max * ($iv / 2**(2*RANDBITS));
}

Assuming randbits = 15, double_rand mimics randbits = 30, providing 1073741824 (2**30) possible distinct values. This alleviates (but can never eliminate) both of the problems mentioned above.

like image 23
Michael Carman Avatar answered Nov 14 '22 21:11

Michael Carman