I have this table:
person_id int(10) pk
points int(6) index
(other columns, not very important)
I have this query for picking a random row, which is very fast on a table with 10M rows:
SELECT person_id
FROM persons AS r1
JOIN (SELECT (RAND() *
              (SELECT MAX(person_id)
               FROM persons)) AS id) AS r2
WHERE r1.person_id >= r2.id
ORDER BY r1.person_id ASC
LIMIT 1
This is all great but now I wish to show only people with points > 0. Example table:
PERSON_ID POINTS
1 4
2 6
3 0
4 3
When I append AND points > 0 to the WHERE clause, person_id 3 can no longer be selected, so a gap is created: whenever the random value lands on person_id 3, person_id 4 is selected instead. This gives person 4 a bigger chance of being chosen. Does anyone have suggestions for how I can adjust the query so that it works with the WHERE clause and gives all matching rows the same chance of being selected?
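To make the skew concrete, here is a small sketch (not part of the original query) that computes each eligible row's chance under the filtered query. It assumes the same persons table and that the random value is uniform between 0 and MAX(person_id). A row's chance is then the distance back to the previous eligible person_id, divided by MAX(person_id).

```sql
-- Illustration only: chance of each eligible row being picked by the
-- "first person_id >= random value" query once points > 0 is added.
SELECT p.person_id,
       (p.person_id
        - COALESCE((SELECT MAX(q.person_id)   -- previous eligible id, 0 if none
                    FROM persons AS q
                    WHERE q.points > 0
                      AND q.person_id < p.person_id), 0))
       / (SELECT MAX(person_id) FROM persons) AS chance
FROM persons AS p
WHERE p.points > 0
ORDER BY p.person_id;
```

For the example table this gives person 1 and person 2 a quarter each and person 4 a half, because person 4 also absorbs the slot left by person 3.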
About the table: the person_id values are contiguous, with no gaps. About 90% of the rows will have 0 points. I want a version of the query for points = 0 and one for points > 0.
Before someone says "use ORDER BY RAND()": this is not a solution for tables with more than a few 100k rows.
Bonus question: is it possible to select x random rows in one query, so I do not have to run this query several times when I want more random rows?
Important note: performance is key. With 10M+ rows the query should not take much longer than the current query, which takes 0.0005 seconds; I would prefer to stay under 0.05 seconds.
Last note: if you think the query can never be this fast with the above requirements, but another solution is possible (like fetching 100 rows and showing x random ones that have more than 0 points), please tell me :)
I really appreciate your help, and all help is welcome :)
You could generate gap-free IDs inline for only the records you really want to work with, and then generate the random selector from the total number of those records.
Try this (props to the chosen answer here for the row_number generator):
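SELECT r1.*
FROM
  (SELECT person_id,
          -- gap-free numbering 1..N over the eligible rows only
          -- (alias is row_num because ROW_NUMBER is reserved in MySQL 8.0)
          @curRow := @curRow + 1 AS row_num
   FROM persons AS p,
        (SELECT @curRow := 0) r0
   WHERE points > 0) r1,
  (SELECT COUNT(1) * RAND() AS id
   FROM persons
   WHERE points > 0) r2
-- compare against the generated number, not the gappy person_id
WHERE r1.row_num >= r2.id
ORDER BY r1.row_num ASC
LIMIT 1;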
You can mess with it in this sqlfiddle.
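As a variant, the same idea can be sketched with a window function instead of a user variable. This is not part of the original answer and assumes MySQL 8.0 or later, where ROW_NUMBER() is available and assigning user variables inside expressions is deprecated. The aliases rn and target are made up for illustration.

```sql
-- Sketch, assuming MySQL 8.0+: number the eligible rows 1..N,
-- then pick the row whose number equals a random integer in 1..N.
SELECT r1.person_id
FROM (SELECT person_id,
             ROW_NUMBER() OVER (ORDER BY person_id) AS rn
      FROM persons
      WHERE points > 0) AS r1
JOIN (SELECT FLOOR(1 + RAND() * COUNT(*)) AS target  -- RAND() is in [0, 1)
      FROM persons
      WHERE points > 0) AS r2
  ON r1.rn = r2.target;
```

Because every eligible row owns exactly one number, each one should be equally likely to be picked. If no row has points > 0, the query simply returns nothing.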