
selection based on percentage weighting

Tags: python, c#, algorithm, random

I have a set of values, and an associated percentage for each:

a: 70% chance
b: 20% chance
c: 10% chance

I want to select a value (a, b, c) based on the percentage chance given.

How do I approach this?

My attempt so far looks like this:

    r = random.random()
    if r <= .7:
        return a
    elif r <= .9:
        return b
    else:
        return c

I'm stuck coming up with an algorithm to handle this. How should I approach this so that it can handle larger sets of values without just chaining together if-else flows?

(Any explanation or answers in pseudo-code are fine. A Python or C# implementation would be especially helpful.)


asked Sep 07 '10 02:09

Corey Goldberg

2 Answers

Here is a complete solution in C#:

    using System;
    using System.Collections.Generic;

    public class ProportionValue<T>
    {
        public double Proportion { get; set; }
        public T Value { get; set; }
    }

    public static class ProportionValue
    {
        public static ProportionValue<T> Create<T>(double proportion, T value)
        {
            return new ProportionValue<T> { Proportion = proportion, Value = value };
        }

        static Random random = new Random();

        public static T ChooseByRandom<T>(
            this IEnumerable<ProportionValue<T>> collection)
        {
            // Draw once in [0, 1), then walk the items, subtracting each
            // proportion until the draw falls inside the current item's share.
            var rnd = random.NextDouble();
            foreach (var item in collection)
            {
                if (rnd < item.Proportion)
                    return item.Value;
                rnd -= item.Proportion;
            }
            throw new InvalidOperationException(
                "The proportions in the collection do not add up to 1.");
        }
    }

Usage:

    var list = new[] {
        ProportionValue.Create(0.7, "a"),
        ProportionValue.Create(0.2, "b"),
        ProportionValue.Create(0.1, "c")
    };

    // Outputs "a" with probability 0.7, etc.
    Console.WriteLine(list.ChooseByRandom());
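Since the question also asks for Python, here is a rough sketch of the same walk-and-subtract idea in Python 3. It is not part of the C# code above: the function name `choose_by_proportion` and the optional `rng` parameter are made up for illustration, and the sketch assumes the proportions sum to 1.

```python
import random


def choose_by_proportion(pairs, rng=random):
    """Return one value from (proportion, value) pairs whose proportions sum to 1.

    `rng` is a hypothetical hook: any object with a random() method
    returning a float in [0.0, 1.0).
    """
    r = rng.random()
    for proportion, value in pairs:
        if r < proportion:
            return value
        # Shift the draw so it is measured against the next item's share.
        r -= proportion
    raise ValueError("The proportions do not add up to 1.")


pairs = [(0.7, "a"), (0.2, "b"), (0.1, "c")]
print(choose_by_proportion(pairs))
```

As in the C# version, a single random number is drawn per selection and the cost grows linearly with the number of items.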

answered Oct 09 '22 07:10

Timwi

For Python:

    >>> import random
    >>> dst = 70, 20, 10
    >>> vls = 'a', 'b', 'c'
    >>> picks = [v for v, d in zip(vls, dst) for _ in range(d)]
    >>> for _ in range(12): print random.choice(picks),
    ...
    a c c b a a a a a a a a
    >>> for _ in range(12): print random.choice(picks),
    ...
    a c a c a b b b a a a a
    >>> for _ in range(12): print random.choice(picks),
    ...
    a a a a c c a c a a c a
    >>>

General idea: make a list in which each item is repeated a number of times proportional to the probability it should have, then use random.choice to pick one entry uniformly at random; the result matches your required probability distribution. This can be a bit wasteful of memory if your probabilities are expressed in peculiar ways (e.g., 70, 20, 10 makes a 100-item list where 7, 2, 1 would make a list of just 10 items with exactly the same behavior), but you could divide all the counts in the probabilities list by their greatest common divisor if you think that's likely to be a big deal in your specific application scenario.
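A minimal Python 3 sketch of that reduction, assuming the weights are positive integers. The helper name `build_picks` is invented for this example; `functools.reduce` and `math.gcd` are from the standard library.

```python
import random
from functools import reduce
from math import gcd


def build_picks(values, counts):
    """Repeat each value in proportion to its count, after dividing out
    the greatest common divisor of all counts (assumed positive integers)."""
    common = reduce(gcd, counts)
    return [v for v, c in zip(values, counts) for _ in range(c // common)]


picks = build_picks(["a", "b", "c"], [70, 20, 10])
print(len(picks))            # 10 entries instead of 100
print(random.choice(picks))  # one uniform draw from the expanded list
```

The list is built once up front; each later selection is a single `random.choice` call on it.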

Apart from memory consumption issues, this should be the fastest solution: just one random number generation per required output result, and the fastest possible lookup from that random number, with no comparisons etc. If your likely probabilities are very weird (e.g., floating-point numbers that need to be matched to many, many significant digits), other approaches may be preferable ;-).
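One such alternative, sketched here as an assumption rather than as something from the answer itself, is to keep running totals of the weights and binary-search them. This works with arbitrary float weights and needs no expanded list. The names `make_chooser` and `rng` are hypothetical; `itertools.accumulate`, `bisect.bisect_right` and `random.choices` (Python 3.6+) are standard library.

```python
import bisect
import itertools
import random


def make_chooser(values, weights):
    """Precompute cumulative weights once; each draw is then a binary search."""
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]

    def choose(rng=random):
        r = rng.random() * total
        # Index of the first running total strictly greater than r.
        i = bisect.bisect_right(cumulative, r)
        # Guard against floating-point rounding pushing r up to the total.
        return values[min(i, len(values) - 1)]

    return choose


values = ["a", "b", "c"]
weights = [0.7, 0.2, 0.1]
choose = make_chooser(values, weights)
print(choose())

# The standard library offers the same kind of weighted draw directly:
print(random.choices(values, weights=weights, k=5))
```

Here the weights do not need to sum to 1, since the draw is scaled by their total, and each selection costs a logarithmic number of comparisons instead of a single index lookup.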

answered Oct 09 '22 08:10

Alex Martelli
