What is a fast way to select a random row from a large mysql table?
I'm working in php, but I'm interested in any solution even if it's in another language.
First, we have specified the table name to which we are going to select random records. Second, we have specified the RAND function that returns random values for each row in the table. Third, we have specified an ORDER BY This clause sorts all table rows by the random number generated by the RAND() function.
NewID() Creates a unique value of type uniqueidentifier. and your table will be sorted by this random values.
To get the random value between two values, use MySQL rand() method with floor(). The syntax is as follows. select FLOOR( RAND() * (maximumValue-minimumValue) + minimumValue) as anyVariableName; Let us check with some maximum and minimum value.
Grab all the id's, pick a random one from it, and retrieve the full row.
If you know the id's are sequential without holes, you can just grab the max and calculate a random id.
If there are holes here and there but mostly sequential values, and you don't care about a slightly skewed randomness, grab the max value, calculate an id, and select the first row with an id equal to or above the one you calculated. The reason for the skewing is that id's following such holes will have a higher chance of being picked than ones that follow another id.
If you order by random, you're going to have a terrible table-scan on your hands, and the word quick doesn't apply to such a solution.
Don't do that, nor should you order by a GUID, it has the same problem.
I knew there had to be a way to do it in a single query in a fast way. And here it is:
A fast way without involvement of external code, kudos to
http://jan.kneschke.de/projects/mysql/order-by-rand/
SELECT name
FROM random AS r1 JOIN
(SELECT (RAND() *
(SELECT MAX(id)
FROM random)) AS id)
AS r2
WHERE r1.id >= r2.id
ORDER BY r1.id ASC
LIMIT 1;
MediaWiki uses an interesting trick (for Wikipedia's Special:Random feature): the table with the articles has an extra column with a random number (generated when the article is created). To get a random article, generate a random number and get the article with the next larger or smaller (don't recall which) value in the random number column. With an index, this can be very fast. (And MediaWiki is written in PHP and developed for MySQL.)
This approach can cause a problem if the resulting numbers are badly distributed; IIRC, this has been fixed on MediaWiki, so if you decide to do it this way you should take a look at the code to see how it's currently done (probably they periodically regenerate the random number column).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With