I have a table with few rows (tops 50), I need to get random value out of table I can do that by ORDER BY RAND() LIMIT 1
Main question is in the point when I have 6k selects in 5 seconds is rand stil 'reliable'?
How is rand calculated, can I seed it over time? (idk, every 5 seconds).
The value of this macro is an integer constant representing the largest value the rand function can return. In the GNU C Library, it is 2147483647 , which is the largest signed integer representable in 32 bits. In other libraries, it may be as low as 32767 .
RAND() Return a random floating-point value.
If srand() is not called, the rand() seed is set as if srand(1) were called at the program start. Any other value for seed sets the generator to a different starting point.
The RAND function in stand-alone applications generates the same numbers each time you run your application because the uniform random number generator that RAND uses is initialized to same state when the application is loaded.
The MySQL pseudo-random number generator is completely deterministic. The docs say:
RAND() is not meant to be a perfect random generator. It is a fast way to generate random numbers on demand that is portable between platforms for the same MySQL version.
It can't use /dev/random because MySQL is designed to work on a variety of operating systems, some of which don't have a /dev/random.
MySQL initializes a default seed at server startup, using the integer returned by time(0)
.
If you're interested in the source line, it's in the MySQL source in file sql/mysqld.cc, function init_server_components()
. I don't think it ever re-seeds itself.
Then the subsequent "random" numbers are based solely on the seed. See source file mysys_ssl/my_rnd.cc, function my_rnd()
.
The best practice solution to your random-selection task, for both performance and quality of randomization, is to generate a random value between the minimum primary key value and maximum primary key value. Then use that random value to select a primary key in your table:
SELECT ... FROM MyTable WHERE id > $random LIMIT 1
The reason you'd use > instead of = is that you might have gaps in the id due to rows being deleted or rolled back, or you might have other conditions in your WHERE clause so that you have gaps in between rows that match your conditions.
The disadvantages of this greater-than method:
Advantages of this method:
RAND is pseudorandom. Be careful using it for security stuff. I don't think your "choose one row randomly out of fifty" is for security, so you're probably OK.
It's pretty fast for a small table. It will be horrible for picking a random row out of a large table: it will has to tag every row with a pseudorandom number and then sort them. For the application you're describing, @TheEwook's suggestion is exactly right; sorting even a small table more often than once a millisecond can swamp even powerful MySQL hardware.
Don't seed RAND, ever, unless you're testing and you want a repeatable sequence of random numbers for some kind of unit test. I learned this the hard way once when generating what I thought were hard-to-guess session tokens. The MySQL guys did a good job with RAND and you can trust them for the application you're talking about.
I think (not sure), if you don't seed it, it starts with a random seed from /dev/random.
If you need crypto-grade random numbers, read /dev/random yourself. But keep in mind that /dev/random can only generate a limited rate. /dev/urandom uses /dev/random to generate a faster rate, but isn't as high-grade in its entropy pool.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With