Weighted random selection from array

People also ask

What is weighted random sampling?

In weighted random sampling (WRS) the items are weighted and the probability of each item to be selected is determined by its relative weight.

What is weighted choice?

Weighted random choices mean selecting random elements from a list or an array by the probability of that element. We can assign a probability to each element and according to that element(s) will be selected. By this, we can select one or more than one element from the list, And it can be achieved in two ways.

How do you create a weighted random number?

To generated a random number, weighted with a given probability, you can use a helper table together with a formula based on the RAND and MATCH functions. Notice, we are intentionally shifting the cumulative probability down one row, so that the value in D5 is zero.

Compute the discrete cumulative density function (CDF) of your list -- or in simple terms the array of cumulative sums of the weights. Then generate a random number in the range between 0 and the sum of all weights (might be 1 in your case), do a binary search to find this random number in your discrete CDF array and get the value corresponding to this entry -- this is your weighted random number.

The algorithm is straight forward

rand_no = rand(0,1)
for each element in array 
     if(rand_num < element.probablity)
          select and break
     rand_num = rand_num - element.probability

I have found this article to be the most useful at understanding this problem fully. This stackoverflow question may also be what you're looking for.

I believe the optimal solution is to use the Alias Method (wikipedia). It requires O(n) time to initialize, O(1) time to make a selection, and O(n) memory.

Here is the algorithm for generating the result of rolling a weighted n-sided die (from here it is trivial to select an element from a length-n array) as take from this article. The author assumes you have functions for rolling a fair die (floor(random() * n)) and flipping a biased coin (random() < p).

Algorithm: Vose's Alias Method

Initialization:

Create arrays Alias and Prob, each of size n.

Create two worklists, Small and Large.

Multiply each probability by n.

For each scaled probability p_i:

If p_i < 1, add i to Small.

Otherwise (p_i ≥ 1), add i to Large.

While Small and Large are not empty: (Large might be emptied first)

Remove the first element from Small; call it l.

Remove the first element from Large; call it g.

Set Prob[l]=p_l.

Set Alias[l]=g.

Set p_g := (p_g+p_l)−1. (This is a more numerically stable option.)

If p_g<1, add g to Small.

Otherwise (p_g ≥ 1), add g to Large.

While Large is not empty:

Remove the first element from Large; call it g.

Set Prob[g] = 1.

While Small is not empty: This is only possible due to numerical instability.

Remove the first element from Small; call it l.

Set Prob[l] = 1.

Generation:

Generate a fair die roll from an n-sided die; call the side i.

Flip a biased coin that comes up heads with probability Prob[i].

If the coin comes up "heads," return i.

Otherwise, return Alias[i].

Here is an implementation in Ruby:

def weighted_rand(weights = {})
  raise 'Probabilities must sum up to 1' unless weights.values.inject(&:+) == 1.0
  raise 'Probabilities must not be negative' unless weights.values.all? { |p| p >= 0 }
  # Do more sanity checks depending on the amount of trust in the software component using this method,
  # e.g. don't allow duplicates, don't allow non-numeric values, etc.
  
  # Ignore elements with probability 0
  weights = weights.reject { |k, v| v == 0.0 }   # e.g. => {"a"=>0.4, "b"=>0.4, "c"=>0.2}

  # Accumulate probabilities and map them to a value
  u = 0.0
  ranges = weights.map { |v, p| [u += p, v] }   # e.g. => [[0.4, "a"], [0.8, "b"], [1.0, "c"]]

  # Generate a (pseudo-)random floating point number between 0.0(included) and 1.0(excluded)
  u = rand   # e.g. => 0.4651073966724186
  
  # Find the first value that has an accumulated probability greater than the random number u
  ranges.find { |p, v| p > u }.last   # e.g. => "b"
end

How to use:

weights = {'a' => 0.4, 'b' => 0.4, 'c' => 0.2, 'd' => 0.0}

weighted_rand weights

What to expect roughly:

sample = 1000.times.map { weighted_rand weights }
sample.count('a') # 396
sample.count('b') # 406
sample.count('c') # 198
sample.count('d') # 0

An example in ruby

#each element is associated with its probability
a = {1 => 0.25 ,2 => 0.5 ,3 => 0.2, 4 => 0.05}

#at some point, convert to ccumulative probability
acc = 0
a.each { |e,w| a[e] = acc+=w }

#to select an element, pick a random between 0 and 1 and find the first   
#cummulative probability that's greater than the random number
r = rand
selected = a.find{ |e,w| w>r }

p selected[0]

Related questions
                            
                                Correct way of looping through C++ arrays
                            
                                Check if optional array is empty
                            
                                Find indices of elements that match a given condition
                            
                                Pass Javascript Array -> PHP
                            
                                PHP: Split string into array, like explode with no delimiter
                            
                                Formatting floats in a numpy array [duplicate]
                            
                                C# Cast Entire Array?
                            
                                Unsetting array values in a foreach loop [duplicate]
                            
                                Join an array by a comma and a space
                            
                                How to make a copy of a 2D array in Python? [duplicate]
                            
                                Create two-dimensional arrays and access sub-arrays in Ruby
                            
                                How to get single value from this multi-dimensional PHP array
                            
                                Numpy: Creating a complex array from 2 real ones?
                            
                                Ruby: How to concatenate array of arrays into one
                            
                                PHP merge arrays with only NOT DUPLICATED values
                            
                                Select last element quickly after a .Split()
                            
                                Check if array contains all elements of another array
                            
                                Arrays.fill with multidimensional array in Java
                            
                                trim all strings in an array
                            
                                Why is numpy's einsum faster than numpy's built in functions?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Weighted random selection from array

Tags:

arrays

algorithm

random

People also ask

Algorithm: Vose's Alias Method

Initialization:

Generation:

Recent Activity

Donate For Us