
Randomly choosing from a list with weighted probabilities

I have an array of N elements (representing the N letters of a given alphabet). Each cell of the array holds an integer: the number of occurrences of that letter in a given text. I want to randomly choose a letter from the alphabet based on its number of occurrences, subject to these constraints:

  • If a letter has a positive (nonzero) value, it must always be possible for the algorithm to choose it (with a larger or smaller probability, of course).

  • If a letter A has a higher value than a letter B, then A has to be more likely to be chosen by the algorithm.

Taking that into account, I've come up with a simple algorithm that might do the job, but I was wondering whether there is a better approach. This seems quite fundamental, and I suspect there are cleverer ways to do it more efficiently. This is the algorithm I came up with:

  • Add up all the frequencies in the array. Store the result in SUM.
  • Choose a random value from 0 to SUM. Store it in RAN.
  • While RAN > 0, starting from the first cell, visit each cell in the array (in order) and subtract the value of that cell from RAN.
  • The last visited cell is the chosen one.
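
The steps above can be sketched in Python as follows. This is only one possible reading of the list, not a definitive implementation: it assumes RAN is drawn from 1 to SUM inclusive (rather than from 0), so that the loop always visits at least one cell and a letter with a count of zero is never picked. The names `pick_linear`, `counts`, and `rng` are hypothetical.

```python
import random


def pick_linear(counts, rng=random):
    """Return an index chosen with probability proportional to counts[index]."""
    total = sum(counts)  # SUM
    if total <= 0:
        raise ValueError("at least one count must be positive")
    # Assumption: RAN is drawn from 1..SUM inclusive, not from 0.
    ran = rng.randint(1, total)
    for index, count in enumerate(counts):
        ran -= count  # subtract the value of the visited cell
        if ran <= 0:
            return index  # the last visited cell is the chosen one
    raise AssertionError("unreachable: ran never exceeds the total")


if __name__ == "__main__":
    print(pick_linear([1, 4, 3, 2]))
```

Each draw costs O(N) in the worst case, because the scan may have to walk the whole array.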

So, is there a better thing to do than this? Am I missing something?

I'm aware most modern computers can compute this so fast that I wouldn't even notice if my algorithm were inefficient, so this is more of a theoretical question than a practical one.

I prefer an explained algorithm to just code for an answer, but if you're more comfortable providing your answer in code, I have no problem with that.

Setzer22 asked Jun 22 '13 12:06




1 Answer

The idea:

  • Iterate through all the elements and set the value of each element to the cumulative frequency thus far.
  • Generate a random number between 1 and the sum of all frequencies (inclusive).
  • Do a binary search on the values for this number (finding the first value greater than or equal to the number).
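
A minimal Python sketch of this idea, under the assumption that the frequencies are non-negative integers with at least one positive value. It uses the standard library's `itertools.accumulate` for the running totals and `bisect.bisect_left` for the "first value greater than or equal to" search; the function names `build_cumulative` and `pick_binary` are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def build_cumulative(frequencies):
    """Return the running totals of the frequencies (done once, O(N))."""
    return list(accumulate(frequencies))


def pick_binary(cumulative, rng=random):
    """Return an index chosen with probability proportional to its frequency."""
    total = cumulative[-1]  # the last running total is the sum of all frequencies
    if total <= 0:
        raise ValueError("at least one frequency must be positive")
    number = rng.randint(1, total)  # 1..total inclusive
    # bisect_left finds the first position whose value is >= number.
    return bisect_left(cumulative, number)


if __name__ == "__main__":
    cumulative = build_cumulative([1, 4, 3, 2])
    print(pick_binary(cumulative))
```

Building the cumulative list is a one-time O(N) pass; after that each draw is O(log N), which pays off when many letters are drawn from the same counts.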

Example:

Element    A B C D
Frequency  1 4 3 2
Cumulative 1 5 8 10

Generate a random number in the range 1-10 (1+4+3+2 = 10, the same as the last value in the cumulative list) and do a binary search on the cumulative values. The search returns elements as follows:

Number   Element returned
1        A
2        B
3        B
4        B
5        B
6        C
7        C
8        C
9        D
10       D
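
The table above can be checked with a short Python sketch. It assumes the same four elements and frequencies as the example; `element_for` and `table` are hypothetical names, and `bisect_left` stands in for the binary search.

```python
from bisect import bisect_left
from itertools import accumulate

elements = ["A", "B", "C", "D"]
frequencies = [1, 4, 3, 2]
cumulative = list(accumulate(frequencies))  # [1, 5, 8, 10]


def element_for(number):
    """Map a number in 1..10 to the first element whose cumulative value is >= number."""
    return elements[bisect_left(cumulative, number)]


# Build the full Number -> Element mapping for every possible draw.
table = {number: element_for(number) for number in range(1, cumulative[-1] + 1)}

if __name__ == "__main__":
    for number, element in table.items():
        print(number, element)
```

Counting how often each element appears in the mapping gives 1, 4, 3, and 2 slots for A, B, C, and D, which matches the frequencies.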
Bernhard Barker answered Sep 28 '22 08:09