How can I generate a random number that is in the range (1,n)
but not in a certain list (i,j)
?
Example: range is (1,500)
, list is [1,3,4,45,199,212,344]
.
Note: The list may not be sorted
One method is rejection sampling:
x
in the range (1, 500)x
in your list of disallowed values? (Can use a hash-set for this check.)
x
is your random value, doneThis will work fine if your set of allowed values is significantly larger than your set of disallowed values:
if there are G
possible good values and B
possible bad values, then the expected number of times you'll have to sample x
from the G + B
values until you get a good value is (G + B) / G
(the expectation of the associated geometric distribution). (You can sense check this. As G
goes to infinity, the expectation goes to 1. As B
goes to infinity, the expectation goes to infinity.)
Another method is to make a list L
of all of your allowed values, then sample L[rand(L.count)]
.
The technique I usually use when the list is length 1 is to generate a random
integer r
in [1,n-1]
, and if r
is greater or equal to that single illegal
value then increment r
.
This can be generalised for a list of length k
for small k
but requires
sorting that list (you can't do your compare-and-increment in random order). If the list is moderately long, then after the sort you can start with a bsearch, and add the number of values skipped to r
, and then recurse into the remainder of the list.
For a list of length k
, containing no value greater or equal to n-k
, you
can do a more direct substitution: generate random r
in [1,n-k]
, and
then iterate through the list testing if r
is equal to list[i]
. If it is
then set r
to n-k+i
(this assumes list
is zero-based) and quit.
That second approach fails if some of the list elements are in [n-k,n]
.
I could try to invest something clever at this point, but what I have so far
seems sufficient for uniform distributions with values of k
much less than
n
...
n-k
, and the other the rest (this can be done in place).r
in [1,n-k]
r
is list[i]
then set r
to n-k+i
and go to step 5).r
was not altered in step 3 then we're finished.Observations:
r
is moved into the hazardous area.k
approaches n
, the maximum size of the upper (sorted) list grows.k
, if more value appear in the upper list (the bigger the sort), the chance of getting a hit in the lower list shrinks, reducing the likelihood of needing to do the sort.Refinement:
Obviously things get very sorty for large k
, but in such cases the list has comparatively few holes into which r
is allowed to settle. This could surely be exploited.
I might suggest something different if many random values with the same list and limits were needed. I hope that the list of illegal values is not the list of results of previous calls to this function, because if it is then you wouldn't want any of this -- instead you would want a Fisher-Yates shuffle.
Rejection sampling would be the simplest if possible as described already. However, if you didn't want use that, you could convert the range and disallowed values to sets and find the difference. Then, you could choose a random value out of there.
Assuming you wanted the range to be in [1,n] but not in [i,j] and that you wanted them uniformly distributed.
In Python
total = range(1,n+1)
disallowed = range(i,j+1)
allowed = list( set(total) - set(disallowed) )
return allowed[random.randrange(len(allowed))]
(Note that this is not EXACTLY uniform since in all likeliness, max_rand%len(allowed) != 0
but this will in most practical applications be very close)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With