
Select a random row from Oracle DB in a performant way

Using : Oracle Database 12c Enterprise Edition Release 12.1.0.2.0

I am trying to fetch a random row. As suggested in other Stack Overflow questions, I used DBMS_RANDOM.VALUE like this -

SELECT column FROM
( SELECT column 
  FROM table
  WHERE COLUMN_VALUE = 'Y' -- value of COLUMN_VALUE
  ORDER BY dbms_random.value 
)
WHERE rownum <= 1

But this query doesn't perform well as the number of requests increases, so I am looking for an alternative.

SAMPLE wouldn't work for me because the sample it picks often contains no rows that match my WHERE clause. The query looked like this -

SELECT column FROM table SAMPLE(1) WHERE COLUMN_VALUE = 'Y'

Because SAMPLE is applied before my WHERE clause, this returns no data most of the time.

P.S.: I am OK with moving some of the logic to the application layer (though I am definitely not looking for answers that suggest loading everything into memory).

Atty asked Dec 10 '25 12:12


1 Answer

The performance problems consist of two aspects:

  • selecting the data with column_value = 'Y' and

  • sorting this subset to get a random record

You didn't say whether the subset of your table with column_value = 'Y' is large or small. This is important and will drive your strategy.

If there are lots of records with column_value = 'Y', use SAMPLE to limit the rows to be sorted. You are right that this could lead to an empty result; in that case, repeat the query (you may additionally add logic that increases the sample percentage to avoid a lot of repeats). This boosts performance because you sort only a sample of the data.

select id from (
  select id from tt SAMPLE(1)
  where column_value = 'Y'
  order by dbms_random.value
)
where rownum <= 1;
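Since the question allows moving some logic to the application layer, the "repeat with a larger sample" idea could be sketched roughly as below. This is only an illustration: `pick_random_row`, `run_sampled_query`, and the parameters `start_pct`, `growth`, and `max_pct` are hypothetical names, and the callable is assumed to run the SAMPLE query above with the given percentage and return one row, or None when the sample held no matching row.

```python
from typing import Callable, Optional, Sequence


def pick_random_row(
    run_sampled_query: Callable[[float], Optional[Sequence]],
    start_pct: float = 1.0,
    growth: float = 4.0,
    max_pct: float = 99.0,
) -> Optional[Sequence]:
    """Retry a sampled query, enlarging the SAMPLE percentage on each miss.

    run_sampled_query(pct) is a hypothetical callable supplied by the caller.
    It should execute the SAMPLE(pct) query and return one row, or None
    if the sample contained no row matching the WHERE clause.
    """
    pct = start_pct
    while True:
        row = run_sampled_query(pct)
        if row is not None:
            return row
        if pct >= max_pct:
            # Already at the largest sample we allow (Oracle requires the
            # percentage to stay below 100); let the caller decide what next.
            return None
        # Grow the sample, but never beyond the cap.
        pct = min(pct * growth, max_pct)
```

If this returns None even at the largest sample, the matching subset is probably small, and the unsampled query is likely the better fit. Whether the percentage can be passed as a bind variable or has to be formatted into the statement text is worth checking for your driver and Oracle version.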

If there are only a few records with column_value = 'Y', define an index on this column (or a separate partition); this enables efficient access to those records. Then use the order by dbms_random.value approach. The sort will not degrade performance for a small number of rows.

select id from (
  select id from tt
  where column_value = 'Y'
  order by dbms_random.value
)
where rownum <= 1;

Basically, both approaches keep the set of rows to be sorted small. The first approach performs a table access comparable to a FULL TABLE SCAN; the second performs an INDEX ACCESS for the selected column_value.

Marmite Bomber answered Dec 12 '25 11:12


