Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL RAND() how often can it be used? does it use /dev/random?

Tags:

php

random

mysql

I have a table with few rows (tops 50), I need to get random value out of table I can do that by
ORDER BY RAND() LIMIT 1
Main question is in the point when I have 6k selects in 5 seconds is rand stil 'reliable'? How is rand calculated, can I seed it over time? (idk, every 5 seconds).

like image 883
Kyslik Avatar asked Apr 02 '13 14:04

Kyslik


People also ask

What is the max number of rand ()?

The value of this macro is an integer constant representing the largest value the rand function can return. In the GNU C Library, it is 2147483647 , which is the largest signed integer representable in 32 bits. In other libraries, it may be as low as 32767 .

What is rand () in MySQL?

RAND() Return a random floating-point value.

What happens if Srand () function is not used before generation of random number using rand ()?

If srand() is not called, the rand() seed is set as if srand(1) were called at the program start. Any other value for seed sets the generator to a different starting point.

Why does rand () give the same value?

The RAND function in stand-alone applications generates the same numbers each time you run your application because the uniform random number generator that RAND uses is initialized to same state when the application is loaded.


2 Answers

The MySQL pseudo-random number generator is completely deterministic. The docs say:

RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.

It can't use /dev/random because MySQL is designed to work on a variety of operating systems, some of which don't have a /dev/random.

MySQL initializes a default seed at server startup, using the integer returned by time(0). If you're interested in the source line, it's in the MySQL source in file sql/mysqld.cc, function init_server_components(). I don't think it ever re-seeds itself.

Then the subsequent "random" numbers are based solely on the seed. See source file mysys_ssl/my_rnd.cc, function my_rnd().


The best practice solution to your random-selection task, for both performance and quality of randomization, is to generate a random value between the minimum primary key value and maximum primary key value. Then use that random value to select a primary key in your table:

SELECT ... FROM MyTable WHERE id > $random LIMIT 1

The reason you'd use > instead of = is that you might have gaps in the id due to rows being deleted or rolled back, or you might have other conditions in your WHERE clause so that you have gaps in between rows that match your conditions.

The disadvantages of this greater-than method:

  • Rows following such a gap have a higher chance of being chosen, and the larger the gap the greater the chance.
  • You need to know the MIN(id) and MAX(id) before you generate the random value.
  • Doesn't work as well if you need more than one random row.

Advantages of this method:

  • It's much faster than ORDER BY RAND(), even for a modest table size.
  • You can use a random function outside of SQL.
like image 69
Bill Karwin Avatar answered Sep 29 '22 04:09

Bill Karwin


RAND is pseudorandom. Be careful using it for security stuff. I don't think your "choose one row randomly out of fifty" is for security, so you're probably OK.

It's pretty fast for a small table. It will be horrible for picking a random row out of a large table: it will has to tag every row with a pseudorandom number and then sort them. For the application you're describing, @TheEwook's suggestion is exactly right; sorting even a small table more often than once a millisecond can swamp even powerful MySQL hardware.

Don't seed RAND, ever, unless you're testing and you want a repeatable sequence of random numbers for some kind of unit test. I learned this the hard way once when generating what I thought were hard-to-guess session tokens. The MySQL guys did a good job with RAND and you can trust them for the application you're talking about.

I think (not sure), if you don't seed it, it starts with a random seed from /dev/random.

If you need crypto-grade random numbers, read /dev/random yourself. But keep in mind that /dev/random can only generate a limited rate. /dev/urandom uses /dev/random to generate a faster rate, but isn't as high-grade in its entropy pool.

like image 37
O. Jones Avatar answered Sep 29 '22 06:09

O. Jones