 

Repeatable pagination with randomly ordered rows

I have an API that returns some paginated rows from the DB. It works; however, when I order rows by RANDOM() I get duplicates on consecutive pages. Is there any option to set a random seed per query?

If not, is it possible to set the random SEED globally to force RANDOM() to generate the same values per query? Then I could just change the global seed every 3 minutes or something like that...


I use this code:

SELECT * FROM "table" ORDER BY RANDOM() OFFSET 5 LIMIT 5

Now I want to pass a seed to this query so I can paginate random results. Should I do it like this?

SELECT "table".*, SETSEED(0.1) FROM "table" ORDER BY RANDOM() OFFSET 5 LIMIT 5
SELECT "table".*, SETSEED(0.1) FROM "table" ORDER BY RANDOM() OFFSET 10 LIMIT 5

And results will be correctly paginated?

user606521 asked Aug 25 '14 09:08

2 Answers

If the order needs to be "shuffled" but not truly random...

(Update: see my other answer for a more flexible and randomizable solution.)

You say "random" order, which is what you get when calling ORDER BY random() - for each row, PostgreSQL calls random(), gets a value, and uses that to decide how to sort that row within the set of results.

To make this repeatable, you have to mess with seeds. This feels icky. According to the docs:

the effects will persist until the end of the session, unless overridden by another SET

I think this means that when using a connection pool, setseed mutates the connection for the next process that uses that connection.
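If you do go the seed route, a safer pattern (a sketch, not tested against your setup) is to call setseed in its own statement on the same session, then run the paginated query. Putting setseed in the select list of the ORDER BY random() query itself, as in the question, likely re-seeds on every row, since setseed is a volatile function evaluated per row:

```sql
-- Seed once for the session, then run each page's query
-- on that SAME connection (pin the connection if pooling).
SELECT setseed(0.1);
SELECT * FROM "table" ORDER BY random() OFFSET 5 LIMIT 5;
```

You would still need to re-run setseed before every page request, and make sure the pool hands you the same (or a freshly re-seeded) connection each time.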

What about modulo?

I have a case where I don't need true randomness. My criteria are:

  • not the same order every time
  • predictable order within pages of the same result set, so that we don't get duplicates on subsequent pages

Eg, this would be fine:

  • Listing 1
    • page 1: items 1, 4
    • page 2: items 3, 2
  • Listing 2 (different user, or same user coming back later)
    • page 1: items 3, 1
    • page 2: items 2, 4

To get something like this, modulo seems to work well. Eg, ORDER BY id % 7, id for all pages of request 1, and ORDER BY id % 11, id for all pages of request 2. That is, for each row, divide its id by the modulus and sort by the remainder. Within rows with the same remainder, sort by id (to ensure the sort is stable).

The modulus could be picked randomly for the first page, then reused as a parameter for each subsequent page request.
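In practice that could look like this (hypothetical table and column names; page size of 5 as in the question):

```sql
-- Request 1 happened to pick modulus 7; every page of that
-- listing reuses it, so pages never overlap.
SELECT * FROM my_table ORDER BY id % 7, id LIMIT 5 OFFSET 0;   -- page 1
SELECT * FROM my_table ORDER BY id % 7, id LIMIT 5 OFFSET 5;   -- page 2

-- Request 2 picked modulus 11, giving a different shuffle.
SELECT * FROM my_table ORDER BY id % 11, id LIMIT 5 OFFSET 0;  -- page 1
```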

You can see how this might work for your database like this:

echo "select id, id % 77 FROM my_table ORDER BY id % 77, id" | psql my_db > sort.txt

A prime modulus will probably give you the most variation. And if your ids start at 1 (such that % 77 would make the first 77 rows return in the normal order), you could try doing a modulus on a timestamp field instead. Eg:

ORDER BY (extract(epoch from inserted_at)* 100000)::bigint % 77

But you'd need an index on that expression to make it performant.
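A sketch of such an expression index (assuming inserted_at is a timestamp without time zone, so that extract(epoch from ...) on it is immutable and therefore indexable):

```sql
-- Index the exact expression used in ORDER BY so PostgreSQL can
-- read rows in pre-sorted order instead of sorting at query time.
CREATE INDEX my_table_epoch_mod_idx
    ON my_table (((extract(epoch from inserted_at) * 100000)::bigint % 77));
```

Note the trade-off: the modulus is baked into the index, so this only helps queries using that one modulus. If the modulus varies per request, the index won't be used.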

Nathan Long answered Oct 14 '22 06:10

With this UNION ALL technique the random order is repeatable:

select a, b
from (
    -- this row runs setseed first, before random() is evaluated
    select setseed(0.1), null as a, null as b

    union all

    select null, a, b
    from t

    offset 1  -- discard the setseed row itself
) s
order by random()
offset 0
limit 5
;
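Subsequent pages then reuse the same seed and just advance the outer offset; because the sequence of random() values is identical on each run, the overall order is the same and pages don't overlap (sketch):

```sql
-- page 2: same seed, next offset
select a, b
from (
    select setseed(0.1), null as a, null as b
    union all
    select null, a, b
    from t
    offset 1
) s
order by random()
offset 5
limit 5;
```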
Clodoaldo Neto answered Oct 14 '22 06:10