I want to generate one random combination out of all possible combinations_with_replacement
. The tricky bit is that I want each of the possible outcomes to have the same probability without needing to generate (not even implicit) all the possible outcomes.
For example:
import itertools
import random
random.choice(list(itertools.combinations_with_replacement(range(4), 2)))
That approach is way too slow (and memory expensive) because it needs to create all possible combinations whereas I only want one.
It's not so bad if I determine how many combinations_with_replacement
there will be and use random.randrange
together with next
and itertools.islice
on the itertools.combinations_with_replacement
. That doesn't need to generate all possible combinations (except in the worst-case). But it's still too slow.
On the other hand the recipe mentioned in the itertools
documentation is fast but not each combination has the same probability.
Well, I'm in a bit of a dilemma, because I've found an algorithm that works, but I don't know why. So do what you want of if, maybe some mathematician in the room can work out the probabilities, but it does empirically work. The idea is to pick one element at a time, increasing the probability of the selected elements. I suspect the reasoning must be similar to that of reservoir sampling, but I didn't work it out.
from random import choice
from itertools import combinations_with_replacement
population = ["A", "B", "C", "D"]
k = 3
def random_comb(population, k):
idx = []
indices = list(range(len(population)))
for _ in range(k):
idx.append(choice(indices))
indices.append(idx[-1])
return tuple(population[i] for i in sorted(idx))
combs = list(combinations_with_replacement(population, k))
counts = {c: 0 for c in combs}
for _ in range(100000):
counts[random_comb(population, k)] += 1
for comb, count in sorted(counts.items()):
print("".join(comb), count)
The output is the number of times each possibility has appeared after 100,000 runs:
AAA 4913
AAB 4917
AAC 5132
AAD 4966
ABB 5027
ABC 4956
ABD 4959
ACC 5022
ACD 5088
ADD 4985
BBB 5060
BBC 5070
BBD 5056
BCC 4897
BCD 5049
BDD 5059
CCC 5024
CCD 5032
CDD 4859
DDD 4929
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With