I wish to select a random word from a list where there is a known chance for each word, for example:
Fruit   Probability
Orange  0.10
Apple   0.05
Mango   0.15
etc.
What would be the best way of implementing this? The actual list I will draw from is up to 100 items long, and the percentages do not add up to 100%: they fall short to account for the items that had a really low chance of occurrence. Ideally I would like to read this from a CSV file, which is where I store the data. This is not a time-critical task.
Thank you for any advice on how best to proceed.
You can pick items with weighted probabilities if you assign each item a number range proportional to its probability, pick a random number between zero and the sum of the ranges, and find which item's range contains it. The following class does exactly that:
from random import random

class WeightedChoice(object):
    def __init__(self, weights):
        """Pick items with weighted probabilities.

        weights
            a sequence of tuples of item and its weight.
        """
        self._total_weight = 0.
        self._item_levels = []
        # Store the running total next to each item as the upper end of its range.
        for item, weight in weights:
            self._total_weight += weight
            self._item_levels.append((self._total_weight, item))

    def pick(self):
        # Draw a point in [0, total weight) and return the first item
        # whose range reaches it.
        pick = self._total_weight * random()
        for level, item in self._item_levels:
            if level >= pick:
                return item
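For a list of around 100 items the linear scan in pick is perfectly adequate. If the lookup ever needed to be faster, one possible variation is to keep the running totals in a list and binary-search it. The sketch below is only an illustration of that idea: the function name weighted_pick and the rng parameter are made up for this example and are not part of the class above.

```python
import bisect
import itertools
import random


def weighted_pick(items, weights, rng=random):
    """Pick one item with probability proportional to its weight.

    `weighted_pick` and `rng` are hypothetical names used for illustration.
    """
    # Running totals [w0, w0+w1, ...]; the last entry is the total weight.
    levels = list(itertools.accumulate(weights))
    # Uniform point in [0, total weight).
    point = rng.random() * levels[-1]
    # Index of the first running total strictly greater than the point.
    index = bisect.bisect_right(levels, point)
    # Clamp in case float rounding pushes the point onto the last total.
    return items[min(index, len(items) - 1)]


print(weighted_pick(["Orange", "Apple", "Mango"], [0.10, 0.05, 0.15]))
```

Because only the ratios between the weights matter here, the weights do not have to sum to 1.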
You can then load the CSV file with the csv module and feed it to the WeightedChoice class:
import csv

# Each row of file.csv is expected to hold an item and its weight.
with open('file.csv') as f:
    weighted_items = [(item, float(weight)) for item, weight in csv.reader(f)]
picker = WeightedChoice(weighted_items)
print(picker.pick())
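On Python 3.6 or later, the standard library's random.choices can do the weighted pick directly, so a custom class may not be needed. The sketch below assumes a two-column CSV of item and weight with no header row. To stay self-contained it reads made-up sample rows from an io.StringIO object in place of the real file.

```python
import csv
import io
import random

# Stand-in for open('file.csv'); these rows are made-up sample data.
sample_csv = io.StringIO("Orange,0.10\nApple,0.05\nMango,0.15\n")

items = []
weights = []
for name, weight in csv.reader(sample_csv):
    items.append(name)
    weights.append(float(weight))

# random.choices treats the weights as relative, so they need not sum to 1.
# It returns a list of k picks, so take the first element for a single word.
word = random.choices(items, weights=weights, k=1)[0]
print(word)
```

Since the weights are relative, the missing share is spread proportionally over the listed items. If the shortfall should instead mean "none of the listed words", it would need its own row in the data.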
What you want is a single draw from a multinomial distribution (also called a categorical distribution). Assuming you have two lists, one of items and one of probabilities, and the probabilities sum to 1 (if they do not, add a default item whose probability covers the remainder):
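import random

def choose(items, chances):
    # Walk the cumulative probability until it passes the random draw.
    p = chances[0]
    x = random.random()
    i = 0
    # The bounds check stops at the last item if the chances sum to
    # slightly less than x, instead of raising an IndexError.
    while x > p and i < len(items) - 1:
        i = i + 1
        p = p + chances[i]
    return items[i]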
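Since the probabilities in the question fall short of 1, the "default value" step could look something like the sketch below. The helper name pad_with_default and the placeholder item "other" are assumptions made for this illustration. They do not come from the question's data.

```python
import math


def pad_with_default(items, chances, default="other"):
    """Return copies of items and chances extended so the chances sum to 1.

    `pad_with_default` and the "other" placeholder are hypothetical names.
    """
    total = math.fsum(chances)
    if total > 1.0 and not math.isclose(total, 1.0):
        raise ValueError("probabilities sum to more than 1")
    # Whatever probability is left over goes to the default item.
    remainder = max(0.0, 1.0 - total)
    return list(items) + [default], list(chances) + [remainder]


padded_items, padded_chances = pad_with_default(
    ["Orange", "Apple", "Mango"], [0.10, 0.05, 0.15]
)
print(padded_items[-1], padded_chances[-1])
```

The padded lists can then be passed to choose, which will return the default item whenever the draw lands in the uncovered remainder.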