
How to do a weighted random sample of categories in Python

Given a list of tuples where each tuple consists of a probability and an item, I'd like to sample an item according to its probability. For example, given the list [(.3, 'a'), (.4, 'b'), (.3, 'c')], I'd like to sample 'b' 40% of the time.

What's the canonical way of doing this in Python?

I've looked at the random module, which doesn't seem to have an appropriate function, and at numpy.random, which has a multinomial function but doesn't seem to return the results in a nice form for this problem. I'm basically looking for something like mnrnd in MATLAB.

Many thanks.

Thanks for all the answers so quickly. To clarify, I'm not looking for explanations of how to write a sampling scheme, but rather to be pointed to an easy way to sample from a multinomial distribution given a set of objects and weights, or to be told that no such function exists in a standard library and so one should write one's own.

John asked Jun 21 '11 21:06


People also ask

How do I make a weighted random number in Python?

Use the random.choices() function to generate weighted random choices. It is part of Python's standard random module and accepts a list of weights alongside the population. The choices are made with replacement, so the same item can be drawn more than once.
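Applied to the list from the question, that could look roughly like the sketch below. It assumes Python 3.6 or later (where random.choices is available); the names weighted_items and sample_items are made up for illustration, and the seed is only there to make the run repeatable.

```python
import random

# (probability, item) pairs, as in the question.
weighted_items = [(.3, 'a'), (.4, 'b'), (.3, 'c')]


def sample_items(pairs, k=1, rng=random):
    """Draw k items with replacement, each in proportion to its weight.

    `pairs` is a sequence of (weight, item) tuples. `rng` is either the
    random module or a random.Random instance (hypothetical parameter,
    added here so the example can be seeded).
    """
    weights, items = zip(*pairs)  # split into two parallel tuples
    return rng.choices(items, weights=weights, k=k)


# Draw many samples and check how often 'b' comes up.
draws = sample_items(weighted_items, k=10_000, rng=random.Random(0))
share_b = draws.count('b') / len(draws)
print(f"share of 'b': {share_b:.3f}")  # should land near 0.4
```

The weights do not have to sum to 1; random.choices treats them as relative weights.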

What is a weighted random sample?

In weighted random sampling (WRS) the items are weighted and the probability of each item to be selected is determined by its relative weight.


1 Answer

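This might do what you want. It returns the indices of the sampled items (five draws here) rather than the items themselves:

import numpy
numpy.array([.3, .4, .3]).cumsum().searchsorted(numpy.random.sample(5))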
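To make the idea behind the one-liner above explicit (cumulative sums, then a binary search with a uniform random number), here is a minimal sketch of the same technique using only the standard library, with itertools.accumulate and bisect swapped in for cumsum and searchsorted. The function name make_sampler and its rng parameter are hypothetical, and the result is mapped back to the item instead of an index.

```python
import bisect
import itertools
import random

# (probability, item) pairs, as in the question.
pairs = [(.3, 'a'), (.4, 'b'), (.3, 'c')]


def make_sampler(pairs, rng=random):
    """Return a function that draws one item per call by inverse-CDF lookup."""
    probs, items = zip(*pairs)
    cdf = list(itertools.accumulate(probs))  # running totals: ~[0.3, 0.7, 1.0]

    def sample():
        # Scale by the total so weights that do not sum to 1 still work.
        u = rng.random() * cdf[-1]
        # First position whose cumulative weight exceeds u.
        i = bisect.bisect_right(cdf, u)
        # Clamp in case floating-point round-off pushes i past the end.
        return items[min(i, len(items) - 1)]

    return sample


sample = make_sampler(pairs, rng=random.Random(0))
draws = [sample() for _ in range(10_000)]
print(draws.count('b') / len(draws))  # should land near 0.4
```

With NumPy, the index array returned by searchsorted can be used to index an array of items in the same way, and numpy.random.choice(items, size=n, p=probs) draws the items directly.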
sholte answered Sep 20 '22 18:09