I have a table in postgres that contains couple of millions of rows. I have checked on the internet and I found the following
SELECT myid FROM mytable ORDER BY RANDOM() LIMIT 1;
It works, but it's really slow... is there another way to make that query, or a direct way to select a random row without reading all the table? By the way 'myid' is an integer but it can be an empty field.
There's an easy way to show a random record in a table: SELECT * FROM table_name ORDER BY RANDOM() LIMIT 1; But this query might take a while to finish as it reads the whole table first then take out a random record from it.
To get a single row randomly, we can use the LIMIT Clause and set to only one row. ORDER BY clause in the query is used to order the row(s) randomly. It is exactly the same as MYSQL. Just replace RAND( ) with RANDOM( ).
You might want to experiment with OFFSET
, as in
SELECT myid FROM mytable OFFSET floor(random() * N) LIMIT 1;
The N
is the number of rows in mytable
. You may need to first do a SELECT COUNT(*)
to figure out the value of N
.
Update (by Antony Hatchkins)
You must use floor
here:
SELECT myid FROM mytable OFFSET floor(random() * N) LIMIT 1;
Consider a table of 2 rows; random()*N
generates 0 <= x < 2
and for example SELECT myid FROM mytable OFFSET 1.7 LIMIT 1;
returns 0 rows because of implicit rounding to nearest int.
PostgreSQL 9.5 introduced a new approach for much faster sample selection: TABLESAMPLE
The syntax is
SELECT * FROM my_table TABLESAMPLE BERNOULLI(percentage);
SELECT * FROM my_table TABLESAMPLE SYSTEM(percentage);
This is not the optimal solution if you want only one row selected, because you need to know the COUNT of the table to calculate the exact percentage.
To avoid a slow COUNT and use fast TABLESAMPLE for tables from 1 row to billions of rows, you can do:
SELECT * FROM my_table TABLESAMPLE SYSTEM(0.000001) LIMIT 1;
-- if you got no result:
SELECT * FROM my_table TABLESAMPLE SYSTEM(0.00001) LIMIT 1;
-- if you got no result:
SELECT * FROM my_table TABLESAMPLE SYSTEM(0.0001) LIMIT 1;
-- if you got no result:
SELECT * FROM my_table TABLESAMPLE SYSTEM(0.001) LIMIT 1;
...
This might not look so elegant, but probably is faster than any of the other answers.
To decide whether you want to use BERNULLI oder SYSTEM, read about the difference at http://blog.2ndquadrant.com/tablesample-in-postgresql-9-5-2/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With