
MySQL: Alternatives to ORDER BY RAND()

I've read about a few alternatives to MySQL's ORDER BY RAND(), but most of them apply only where a single random result is needed.

Does anyone have any idea how to optimize a query that returns multiple random results, such as this:

      SELECT u.id,
             p.photo
        FROM users u, profiles p
       WHERE p.memberid = u.id
         AND p.photo != ''
         AND (u.ownership=1 OR u.stamp=1)
    ORDER BY RAND()
       LIMIT 18
Tony asked Dec 01 '09 00:12




2 Answers

UPDATE 2016

This solution works best using an indexed column.

Here is a simple example of an optimized query, benchmarked against 100,000 rows.

OPTIMIZED: 300ms

-- TABLE is a reserved word in MySQL, so the placeholder name is backtick-quoted.
SELECT
    g.*
FROM
    `table` g
        JOIN
    (SELECT
        id
    FROM
        `table`
    WHERE
        RAND() < (SELECT
                ((4 / COUNT(*)) * 10)
            FROM
                `table`)
    ORDER BY RAND()
    LIMIT 4) AS z ON z.id = g.id

Note about the limit amount: the 4 in LIMIT 4 and the 4 in 4/COUNT(*) need to be the same number. Changing how many rows you return doesn't affect the speed much: the benchmarks at LIMIT 4 and LIMIT 1000 were the same, and LIMIT 10,000 took it up to 600ms.

Note about the join: randomizing just the id is faster than randomizing whole rows, since otherwise each entire row has to be copied into memory and then randomized. The join can be to any table that is linked to the subquery; its purpose is to prevent table scans.

Note about the WHERE clause: the RAND() < (...) condition cuts down the number of rows being randomized. Only a fraction of the rows (on average about 10 times the LIMIT, because of the * 10 factor) is sorted, rather than the whole table.

Note about the subquery: if you add joins or extra WHERE conditions, you need to put them in both the subquery and the nested COUNT(*) subquery, so that the count is accurate and the correct data comes back.
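The pre-filter described in these notes can be sketched outside the database. The following is a rough Python simulation, not MySQL: the function name prefiltered_sample, the 10,000-row id list, and the seed are all assumptions made for illustration. It mimics WHERE RAND() < (k / COUNT(*)) * 10 followed by ORDER BY RAND() LIMIT k.

```python
import random

def prefiltered_sample(ids, k, oversample=10, rng=random):
    """Mimic: WHERE RAND() < (k / COUNT(*)) * oversample ORDER BY RAND() LIMIT k."""
    if not ids:
        return [], 0
    threshold = (k / len(ids)) * oversample
    # Pre-filter: each row survives independently with probability `threshold`.
    survivors = [i for i in ids if rng.random() < threshold]
    # The random ordering now touches only the survivors, not the whole table.
    rng.shuffle(survivors)
    return survivors[:k], len(survivors)

rng = random.Random(2016)          # hypothetical seed, only for repeatability
ids = list(range(1, 10_001))       # hypothetical table of 10,000 ids
picked, survivor_count = prefiltered_sample(ids, k=4, rng=rng)
print(picked, survivor_count)
```

The number of surviving rows is itself random and averages about 10 * k. In principle the filter can leave fewer than k rows, in which case fewer than k are returned; with a factor of 10 that should be very rare, but it is worth knowing when choosing a smaller factor.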

UNOPTIMIZED: 1200ms

SELECT
    g.*
FROM
    `table` g
ORDER BY RAND()
LIMIT 4

PROS

4x faster than ORDER BY RAND(). This solution can work with any table that has an indexed column.

CONS

It gets complicated with complex queries: the joins and WHERE conditions have to be maintained in two places, the subquery and the nested count subquery.

Roger answered Sep 19 '22 16:09



Here's an alternative, but it is still based on using RAND():

      SELECT u.id,
             p.photo,
             ROUND(RAND() * x.m_id) 'rand_ind'
        FROM users u,
             profiles p,
             (SELECT MAX(t.id) 'm_id'
                FROM users t) x
       WHERE p.memberid = u.id
         AND p.photo != ''
         AND (u.ownership=1 OR u.stamp=1)
    ORDER BY rand_ind
       LIMIT 18

This is slightly more complex, but gave a better distribution of rand_ind values:

      SELECT u.id,
             p.photo,
             FLOOR(1 + RAND() * x.m_id) 'rand_ind'
        FROM users u,
             profiles p,
             (SELECT MAX(t.id) - 1 'm_id'
                FROM users t) x
       WHERE p.memberid = u.id
         AND p.photo != ''
         AND (u.ownership=1 OR u.stamp=1)
    ORDER BY rand_ind
       LIMIT 18
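One likely reason the FLOOR form distributes more evenly: rounding gives the two end values only half the range that each interior value gets. The sketch below checks this in Python rather than MySQL; the MAX(id) of 10, the seed, and the helper names round_index and floor_index are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter

def round_index(u, max_id):
    # ROUND(RAND() * MAX(id)); floor(x + 0.5) rounds halves up, like MySQL
    # does for non-negative values.
    return math.floor(u * max_id + 0.5)

def floor_index(u, max_id):
    # FLOOR(1 + RAND() * (MAX(id) - 1))
    return math.floor(1 + u * (max_id - 1))

rng = random.Random(2009)   # hypothetical seed, only for repeatability
max_id = 10                 # hypothetical MAX(id)
draws = [rng.random() for _ in range(200_000)]
round_counts = Counter(round_index(u, max_id) for u in draws)
floor_counts = Counter(floor_index(u, max_id) for u in draws)

# Columns: value, count under ROUND, count under FLOOR.
for value in range(0, max_id + 1):
    print(value, round_counts[value], floor_counts[value])
```

In this simulation the ROUND variant produces 0 and max_id about half as often as the values in between, while the FLOOR variant spreads evenly over 1 to max_id - 1.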
OMG Ponies answered Sep 16 '22 16:09
