
Return random results ( order by rand() )

Tags:

random

mysql

I remember reading somewhere that using ORDER BY RAND() is bad, and I just startpaged it and found an article that proves it. ORDER BY RAND() can be extremely slow on large tables, and the suggested solution was to generate a random number in PHP and select based on it. The problem is that I need to check other fields in order to return my records. Some old records may also have been deleted, which leaves gaps in the ids and may cause an issue too. Can anyone suggest a decent way to select a few random records from a table that match certain conditions (for example, the field paid must be equal to 1)?

php_nub_qq asked Jun 04 '13 21:06



2 Answers

The reason that ordering by RAND() can be slow is that you're forcing the database to compute a random value for every row and sort the whole table before returning anything. Reducing the work to a single table scan is much faster (albeit still somewhat slow).

This means that you can get part of the way by sorting only a small random subset instead of the whole table:

  SELECT *
    FROM my_table
   WHERE RAND() < 0.01
ORDER BY RAND()
   LIMIT 100

This selects approximately 1% of the rows in the table, sorts them and returns the top 100. Any extra conditions, such as paid = 1, go in the same WHERE clause. Note that the main issue here (as with @cmd's answer) is that you can't be sure the query returns anything at all: on a small table the random filter may leave fewer than 100 rows, or none.

The approach above involves a full table scan (to decide which rows to use) followed by a sort of approximately 1% of the rows. If you have a lot of rows, you can reduce the percentage accordingly.
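
To make the filter-then-sort idea concrete, here is a minimal sketch in Python. It runs against an in-memory SQLite database only so that it is self-contained; SQLite has no RAND(), so the sketch registers one, and the table layout (id, paid), the helper names sample_rows and build_demo_db, and the 5% fraction are all assumptions made for illustration. Against MySQL the same SQL text should apply, but the placeholder syntax depends on the driver.

```python
import random
import sqlite3


def build_demo_db(n_rows=20000):
    """Create a hypothetical my_table(id, paid) with half of the rows paid."""
    conn = sqlite3.connect(":memory:")
    # SQLite has no RAND(); register one so the MySQL-style query runs as written.
    conn.create_function("RAND", 0, random.random)
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, paid INTEGER)")
    conn.executemany(
        "INSERT INTO my_table (id, paid) VALUES (?, ?)",
        ((i, i % 2) for i in range(1, n_rows + 1)),
    )
    return conn


def sample_rows(conn, fraction=0.01, limit=100):
    """Keep a random fraction of the matching rows, then shuffle only those."""
    query = """
        SELECT id, paid
          FROM my_table
         WHERE paid = 1
           AND RAND() < ?
      ORDER BY RAND()
         LIMIT ?
    """
    return conn.execute(query, (fraction, limit)).fetchall()


conn = build_demo_db()
# About 5% of the 10000 paid rows survive the filter, so 100 rows are
# available with overwhelming probability.
rows = sample_rows(conn, fraction=0.05, limit=100)
```

If the result comes back shorter than the limit, one option is to call sample_rows again with a larger fraction; the fraction is a trade-off between the size of the sort and the risk of coming up short.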

mzedeler answered Oct 01 '22 02:10


How random do you need them to be? If you don't need a perfectly even distribution, try this:

select min(pk_id) from my_table where pk_id > %(random_number)s and paid=1 

where %(random_number)s is a bind variable containing a random integer from 0 to max(pk_id)-1, regenerated each time you run the query. The query returns NULL when no paid row has a pk_id above the random number.
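
A rough sketch of how the calling code could drive this query, again in Python with an in-memory SQLite table so that it runs on its own. The function name random_paid_id, the sample data, and the wrap-around fallback for the NULL case are assumptions added for illustration and are not part of the answer above; positive integer ids are assumed as well.

```python
import random
import sqlite3


def random_paid_id(conn):
    """Return the smallest paid pk_id above a random threshold, or None."""
    (max_id,) = conn.execute("SELECT MAX(pk_id) FROM my_table").fetchone()
    if max_id is None:
        return None  # the table is empty
    # Random integer from 0 to max(pk_id) - 1; assumes ids are positive.
    threshold = random.randint(0, max_id - 1)
    (found,) = conn.execute(
        "SELECT MIN(pk_id) FROM my_table WHERE pk_id > ? AND paid = 1",
        (threshold,),
    ).fetchone()
    if found is None:
        # Assumed fallback: no paid row above the threshold, so wrap around
        # to the first paid row (still None if there are no paid rows).
        (found,) = conn.execute(
            "SELECT MIN(pk_id) FROM my_table WHERE paid = 1"
        ).fetchone()
    return found


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (pk_id INTEGER PRIMARY KEY, paid INTEGER)")
# Ids with gaps, as if some old records had been deleted.
conn.executemany(
    "INSERT INTO my_table (pk_id, paid) VALUES (?, ?)",
    [(1, 1), (2, 0), (5, 1), (6, 1), (9, 0), (10, 0)],
)
picks = {random_paid_id(conn) for _ in range(200)}
```

With this data only ids 1, 5 and 6 can be returned, and not equally often: a row that sits after a gap or after a run of unpaid rows absorbs every threshold that falls in front of it, which is the uneven distribution mentioned above. To get a few records, the function can be called repeatedly until enough distinct ids have been collected.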

cmd answered Oct 01 '22 02:10