SQL select: random order values from one table for each row in another

I have two tables in my database:

(1) PHRASES:

t_phrase
========
I like
They prefer
...
Somebody else wants

and

(2) PLACES:

n_id   t_place
====   =======
1      London
2      Paris
...
N      New York

Table PHRASES has at least as many rows as PLACES. I need to join these two tables so as to select all places with one phrase for each of them, with the phrases randomly distributed across places. The places table isn't too big, maybe 3-4 thousand rows, and an additional WHERE clause on it will limit the output to about 200 places at most.

Ideally, I'd like this to be in one SQL statement, but so far I haven't been able to get my head around this. Therefore the second option is a stored function returning a row of (int, varchar, varchar). For this, I was thinking of something along the lines of:

  1. select all phrases in random order into an array of varchar
  2. loop over places taking one at a time and returning it along with the next phrase from the array

Somehow this seems to me very inefficient, but I can't come up with anything better.

Can you suggest a better idea? Or, even better, a single SQL statement, maybe?

Thanks in advance.

EDIT: Please note that the phrases should NOT be repeated in the resultset. There are always at least as many phrases as there are places.

Aleks G asked Jun 06 '26 04:06


1 Answer

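WITH p AS (
    -- Smaller set: any gapless numbering will do, so no ORDER BY.
    SELECT n_id, t_place, row_number() OVER () AS rn
    FROM   places
    WHERE  <some condition>
    )
, ph AS (
    -- Bigger set: number the rows in random order.
    SELECT t_phrase, row_number() OVER (ORDER BY random()) AS rn
    FROM   phrases
    )
SELECT p.n_id, p.t_place, ph.t_phrase
FROM   p
JOIN   ph USING (rn);  -- pair rows by number; surplus phrases drop out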

Imposing a truly random order on both tables won't make the result any more random; it will only make the query slower. I impose the random order on phrases because, as the question states:

"There are always at least as many phrases as there are places."

The random order has to go on the bigger set; otherwise the surplus rows that the join cuts off would not be a random selection. For the smaller set (places), on the other hand, any sequence of numbers without gaps is good, so I pick the fastest way: row_number() without ORDER BY.
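To see why the random order belongs on the bigger set, here is a small counter-example that does it the other way round. It is only a sketch: the inline VALUES lists stand in for the real tables, using the table and column names from the question and a few of its sample rows.

```sql
-- Counter-example (not the recommended query): random order on the SMALLER set only.
WITH p AS (
    SELECT t_place, row_number() OVER (ORDER BY random()) AS rn
    FROM  (VALUES (1, 'London'), (2, 'Paris')) AS places (n_id, t_place)
    )
, ph AS (
    -- Unordered numbering: usually follows the order of the VALUES list,
    -- but that order is not guaranteed.
    SELECT t_phrase, row_number() OVER () AS rn
    FROM  (VALUES ('I like'), ('They prefer'), ('Somebody else wants')) AS phrases (t_phrase)
    )
SELECT p.t_place, ph.t_phrase
FROM   p
JOIN   ph USING (rn);  -- only rn 1 and 2 find a partner
```

The places get shuffled between runs, but whichever phrase is numbered 3 never finds a partner. If the numbering is stable from run to run, as it typically is, the same phrase is left out every time.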

My example uses CTEs, but it can be done with subqueries just as well. Both CTEs and window functions require PostgreSQL 8.4 or later.
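A minimal sketch of the subquery form, for comparison. It uses the table and column names from the question, and the inline VALUES lists are stand-ins for the real tables so the statement can be run on its own. With real tables, the WHERE clause on places would go inside the first subquery.

```sql
SELECT p.n_id, p.t_place, ph.t_phrase
FROM  (
    -- Smaller set: plain gapless numbering.
    SELECT n_id, t_place, row_number() OVER () AS rn
    FROM  (VALUES (1, 'London'), (2, 'Paris')) AS places (n_id, t_place)
    ) p
JOIN  (
    -- Bigger set: numbering in random order.
    SELECT t_phrase, row_number() OVER (ORDER BY random()) AS rn
    FROM  (VALUES ('I like'), ('They prefer'), ('Somebody else wants')) AS phrases (t_phrase)
    ) ph USING (rn);
```

Each run should return one row per place, each with a different phrase, and which phrases are chosen can change from run to run.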

Erwin Brandstetter answered Jun 09 '26 02:06