I'd like to optimize my queries so I look into mysql-slow.log
.
Most of my slow queries contains ORDER BY RAND()
. I cannot find a real solution to resolve this problem. Theres is a possible solution at MySQLPerformanceBlog but I don't think this is enough. On poorly optimized (or frequently updated, user managed) tables it doesn't work or I need to run two or more queries before I can select my PHP
-generated random row.
Is there any solution for this issue?
A dummy example:
SELECT accomodation.ac_id, accomodation.ac_status, accomodation.ac_name, accomodation.ac_status, accomodation.ac_images FROM accomodation, accomodation_category WHERE accomodation.ac_status != 'draft' AND accomodation.ac_category = accomodation_category.acat_id AND accomodation_category.acat_slug != 'vendeglatohely' AND ac_images != 'b:0;' ORDER BY RAND() LIMIT 1
MySQL select random records using ORDER BY RAND() The function RAND() generates a random value for each row in the table. The ORDER BY clause sorts all rows in the table by the random number generated by the RAND() function. The LIMIT clause picks the first row in the result set sorted randomly.
Yes, MySQL uses your index to sort the information when the order is by the sorted column. Also, if you have indexes in all columns that you have added to the SELECT clause, MySQL will not load the data from the table itself, but from the index (which is faster).
Use of filesort to Satisfy ORDER BY If an index cannot be used to satisfy an ORDER BY clause, MySQL performs a filesort operation that reads table rows and sorts them. A filesort constitutes an extra sorting phase in query execution.
Try this:
SELECT * FROM ( SELECT @cnt := COUNT(*) + 1, @lim := 10 FROM t_random ) vars STRAIGHT_JOIN ( SELECT r.*, @lim := @lim - 1 FROM t_random r WHERE (@cnt := @cnt - 1) AND RAND(20090301) < @lim / @cnt ) i
This is especially efficient on MyISAM
(since the COUNT(*)
is instant), but even in InnoDB
it's 10
times more efficient than ORDER BY RAND()
.
The main idea here is that we don't sort, but instead keep two variables and calculate the running probability
of a row to be selected on the current step.
See this article in my blog for more detail:
Update:
If you need to select but a single random record, try this:
SELECT aco.* FROM ( SELECT minid + FLOOR((maxid - minid) * RAND()) AS randid FROM ( SELECT MAX(ac_id) AS maxid, MIN(ac_id) AS minid FROM accomodation ) q ) q2 JOIN accomodation aco ON aco.ac_id = COALESCE ( ( SELECT accomodation.ac_id FROM accomodation WHERE ac_id > randid AND ac_status != 'draft' AND ac_images != 'b:0;' AND NOT EXISTS ( SELECT NULL FROM accomodation_category WHERE acat_id = ac_category AND acat_slug = 'vendeglatohely' ) ORDER BY ac_id LIMIT 1 ), ( SELECT accomodation.ac_id FROM accomodation WHERE ac_status != 'draft' AND ac_images != 'b:0;' AND NOT EXISTS ( SELECT NULL FROM accomodation_category WHERE acat_id = ac_category AND acat_slug = 'vendeglatohely' ) ORDER BY ac_id LIMIT 1 ) )
This assumes your ac_id
's are distributed more or less evenly.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With