I have a MySQL table that has a row called cur_odds which is a percent number with the percent probability that that row will get selected. How do I make a query that will actually select the rows in approximately that frequency when you run through 100 queries for example?
I tried the following, but a row that has a probability of 0.35 ends up getting selected around 60-70% of the time.
SELECT * FROM table ORDER BY RAND()*cur_odds DESC
All the values of cur_odds in the table add up to 1 exactly.
If cur_odds
is changed rarely you could implement the following algorithm:
1) Create another column prob_sum
, for which
prob_sum[0] := cur_odds[0]
for 1 <= i <= row_count - 1:
prob_sum[i] := prob_sum[i - 1] + cur_odds[i]
2) Generate a random number from 0 to 1:
rnd := rand(0,1)
3) Find the first row for which prob_sum > rnd
(if you create a BTREE index on the prob_sum
, the query should work much faster):
CREATE INDEX prob_sum_ind ON <table> (prob_sum);
SET @rnd := RAND();
SELECT MIN(prob_sum) FROM <table> WHERE prob_sum > @rnd;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With