Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ruby - Pick one element from array by possibility

Tags:

arrays

ruby

I have an array with 3 elements and I want to pick one and add that into another array base on possibility.

For example, num 1 has 5% chance to be picked, num 2 has 60% chance to be picked and num 3 has 35% chance to be picked.

arr = [{:num=>1, :diff=>-29}, {:num=>2, :diff=>5}, {:num=>3, :diff=>25}]

I found below methods from stackoverflow, just wondering if this would work? Or there is another way to do it?

def get_num(arr)
    case rand(100) + 1
    when 1..5
        p arr[0]
    when 6..65
        p arr[1]
    when 66..100
        p arr[2]
    end
end

get_num(arr)


Thanks!

like image 746
Rhona Avatar asked Jan 21 '26 22:01

Rhona


1 Answers

Your code is fine but here are two other approaches.

Use a cumulative distribution function ("CDF")

CDF = [[0.05,0], [0.05+0.60,1], [0.5+0.60+0.35,2]]
  #=> [[0.05,0], [0.65,1], [1.0,2]]
def get_num(arr)
  n = rand
  arr[CDF.find { |mx,_idx| n <= mx }.last]
end
arr = [{:num=>1, :diff=>-29}, {:num=>2, :diff=>5}, {:num=>3, :diff=>25}]
get_num(arr)
  #=> {:num=>2, :diff=>5}
get_num(arr)
  #=> {:num=>2, :diff=>5}
get_num(arr)
  #=> {:num=>3, :diff=>25}
get_num(arr)
  #=> {:num=>1, :diff=>-29}
get_num(arr)
  #=> {:num=>2, :diff=>5}

Suppose:

n = rand
  #=> 0.5385005480168696

then

a = CDF.find { |mx,_idx| n <= mx }
  #=> [0.65,1]
i = a.last
  #=> 1
arr[i]
  #=> {:num=>2, :diff=>5}

Note that I've followed the convention of beginning the name of find's second block variable (_idx) with an underscore to signal to the reader that that block variable is not used in the block calculation. Often just an underscore (_) is used.

Now consider the fraction of times each element of arr will be randomly-drawn if n draws are made:

def outcome_fractions(arr, n)
  n.times
   .with_object(Hash.new(0)) { |_,h| h[get_num(arr)] += 1 }
   .transform_values { |v| v.fdiv(n) }
end

Randomly select from an array of indices

outcome_fractions(arr, 1_000)
  #=> {{:num=>2, :diff=>5}  =>0.612,
  #    {:num=>3, :diff=>25} =>0.328,
  #    {:num=>1, :diff=>-29}=>0.06}
outcome_fractions(arr, 100_000)
  #=> {{:num=>3, :diff=>25} =>0.34818,
  #    {:num=>1, :diff=>-29}=>0.04958,
  #    {:num=>2, :diff=>5}  =>0.60224}

Notice that the fraction of each hash that is randomly drawn approaches its specified population probability as the sample size is increased (though the "pseudo-random" draws are not truly random).

Do not be concerned with how outcome_fractions works.


Here is another way that is more efficient (because it does not use find, which performs a linear search) but uses more memory.

CHOICE = [*[0]*5, *[1]*60, *[2]*35]
  #=> [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
  #    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
  #    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
  #    1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  #    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
  #    2, 2, 2, 2, 2]
def get_num(arr)
  arr[CHOICE[rand(100)]]
end
  #=> {{:num=>2, :diff=>5} =>0.60029,
  #    {:num=>3, :diff=>25}=>0.35022,
  #    {:num=>1, :diff=>-29}=>0.04949}

Note that:

[*[0]*5, *[1]*60, *[2]*35]

produces the same array as

[[0]*5, [1]*60, [2]*35].flatten

The first * in *[0]*5 is the splat operator; the second is the method Array#*. [0]*5 #=> [0,0,0,0,0] is evaluated first.

CHOICE has 100 elements. If the three probabilities were, say, 0.048, 0.604 and 0.348, CHOICE would have 10**3 #=> 1_000 elements (48 zeros, 604 ones and 348 twos).

like image 167
Cary Swoveland Avatar answered Jan 23 '26 20:01

Cary Swoveland