What's the disadvantage of mt_rand?

What's the definition of bias in:

The distribution of mt_rand() return values is biased towards even numbers on 64-bit builds of PHP when max is beyond 2^32.

If it's the kind of bias stated in alternate tie-breaking rules for rounding, I don't think it really matters (since the bias is not really visible).

Besides, mt_rand() is claimed to be four times faster than rand(), just by adding three characters in front!

Assuming mt_rand is available, what's the disadvantage of using it?

Pacerier asked Oct 18 '11 13:10


2 Answers

mt_rand uses the Mersenne Twister algorithm, which is far better than the LCG (linear congruential generator) typically used by rand. For example, the period of a 32-bit LCG is at most a measly 2^32, whereas the period of mt_rand is 2^19937 − 1. Also, when consecutive outputs of an LCG are plotted as points in a multidimensional space, they all lie on a limited number of lines or planes. On top of that, it is not only practically feasible but relatively easy to determine the parameters of an LCG from its output. The only advantage LCGs have is being potentially slightly faster, but on a scale that is completely irrelevant when coding in PHP.
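
To make the period argument concrete, here is a minimal sketch of a toy LCG. The parameters (a = 5, c = 3, m = 16) and the function names lcg_next and lcg_period are made up for illustration and are not what any real rand() implementation uses. The only point is that the period can never exceed the modulus m, and can be much shorter with badly chosen parameters.

```php
<?php
// Toy linear congruential generator: x_{n+1} = (a * x_n + c) mod m.
// All parameters here are hypothetical, chosen small enough to check by hand.
function lcg_next(int $x, int $a, int $c, int $m): int
{
    return ($a * $x + $c) % $m;
}

// Length of the cycle the generator eventually falls into when started at $seed.
function lcg_period(int $seed, int $a, int $c, int $m): int
{
    $seen = [];   // state => step at which that state was first produced
    $x = $seed;
    $step = 0;
    while (!isset($seen[$x])) {
        $seen[$x] = $step++;
        $x = lcg_next($x, $a, $c, $m);
    }
    return $step - $seen[$x];
}

echo lcg_period(0, 5, 3, 16), "\n"; // 16: every state is visited once
echo lcg_period(1, 3, 0, 16), "\n"; // 4: poorly chosen parameters
```

With a 32-bit state the same reasoning caps the period at 2^32, which is the figure quoted above.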

However, mt_rand is not suitable for cryptographic purposes (generation of tokens, passwords or cryptographic keys) either.
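
One way to see part of the problem: the whole output sequence is determined by the seed. The sketch below is only an illustration; the seed 1234 and the helper name sequence_from_seed are arbitrary choices, not anything from the PHP manual.

```php
<?php
// Returns the first $n outputs of mt_rand() after seeding with $seed.
// (Hypothetical helper, for illustration only.)
function sequence_from_seed(int $seed, int $n): array
{
    mt_srand($seed);
    $out = [];
    for ($i = 0; $i < $n; $i++) {
        $out[] = mt_rand(1, 1000000);
    }
    return $out;
}

$first  = sequence_from_seed(1234, 5);
$second = sequence_from_seed(1234, 5);

var_dump($first === $second); // bool(true): same seed, same "random" numbers
```

Anyone who learns or guesses the seed can therefore regenerate every value, which is fine for simulations or shuffling but not for secrets.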

If you need cryptographic randomness, use random_int in PHP 7 or later. On older PHP versions, read from /dev/urandom or /dev/random on a POSIX-conforming operating system.
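
A minimal sketch of what that could look like on PHP 7 or later. The helper name make_token and the 16-byte length are assumptions for illustration; random_bytes is not mentioned above, but it was added to PHP alongside random_int and draws from the same kind of source.

```php
<?php
// Hypothetical helper: a hex token backed by the system CSPRNG (PHP 7+).
function make_token(int $bytes = 16): string
{
    return bin2hex(random_bytes($bytes)); // two hex characters per byte
}

$token = make_token();          // 32 hex characters for 16 bytes
$pin   = random_int(0, 999999); // unbiased integer in [0, 999999]

echo $token, "\n";
printf("%06d\n", $pin);
```

Both functions throw an exception if no suitable randomness source is available, so a failure cannot silently produce weak values.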

phihag answered Sep 28 '22 01:09


The distribution quirk that you quoted is only relevant when the random number range you're generating is larger than 2^32. That is 4294967296.

If you're working with numbers that big, and you need them to be randomised, then perhaps this is a reason to reconsider using mt_rand(). However, if you're working with numbers smaller than this, then it is irrelevant.

It happens because the random number generator's output does not have enough precision to cover those high ranges evenly.

I've never worked with random numbers that large, so I've never needed to worry about it.

The difference between rand() and mt_rand() is a lot more than "just three extra characters". They are entirely different functions and work in completely different ways, just as you wouldn't expect print() and print_r() to behave the same.

mt_rand() gets its name from the "Mersenne Twister" algorithm it uses to generate the random numbers. This algorithm is known to be a quick, efficient and high-quality random number generator, which is why it is available in PHP.

The older rand() function relies on the random number generator provided by the platform's C library rather than one built into PHP. This means that it uses whatever random number generator happens to be the default on the system you're running on. In general, that default generator uses a much older and slower algorithm, hence the claim that mt_rand() is quicker, but it will vary from system to system.

Therefore, for virtually all uses, mt_rand() is a better function to use than rand().

You say "assuming mt_rand() is available", but it always will be, since it was introduced way back in PHP 4.

Spudley answered Sep 27 '22 23:09