Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Stable/repeatable random sort (MySQL, Rails)

I'd like to paginate through a randomly sorted list of ActiveRecord models (rows from MySQL database).

However, this randomization needs to persist on a per-session basis, so that other people that visit the website also receive a random, paginate-able list of records.

Let's say there are enough entities (tens of thousands) that storing the randomly sorted ID values in either the session or a cookie is too large, so I must temporarily persist it in some other way (MySQL, file, etc.).

Initially I thought I could create a function based on the session ID and the page ID (returning the object IDs for that page) however since the object ID values in MySQL are not sequential (there are gaps), that seemed to fall apart as I was poking at it. The nice thing is that it would require no/minimal storage but the downsides are that it is likely pretty complex to implement and probably CPU intensive.

My feeling is I should create an intersection table, something like:

random_sorts( sort_id, created_at, user_id NULL if guest)

random_sort_items( sort_id, item_id, position )

And then simply store the 'sort_id' in the session. Then, I can paginate the random_sorts WHERE sort_id = n ORDER BY position LIMIT... as usual.

Of course, I'd have to put some sort of a reaper in there to remove them after some period of inactivity (based on random_sorts.created_at).

Unfortunately, I'd have to invalidate the sort as new objects were created (and/or old objects being removed, although deletion is very rare). And, as load increases the size/performance of this table (even properly indexed) drops.

It seems like this ought to be a solved problem but I can't find any rails plugins that do this... Any ideas? Thanks!!

like image 667
Matt Rogish Avatar asked Mar 08 '10 21:03

Matt Rogish


2 Answers

MySQL has a RAND function you can use in your ORDER clause, passing a seed tied to the user session.

ORDER BY RAND(?)

Where ? is a seed value from the session. This will give you repeatable ordering across requests.

like image 156
Toby Hede Avatar answered Nov 17 '22 05:11

Toby Hede


I'm probably missing something, but wouldn't something like this

select ... order by sha1(concat($session_id,item_id)) limit m,n;

work to give you a random-ordered, repeatable per-session paginated list ? Not very nice on index usage but you avoid any pre-filling / tmp tables / invalidation.

like image 28
ggiroux Avatar answered Nov 17 '22 03:11

ggiroux