Discrete Probability Distribution in Java

Tags:

java

math

probability

I have a set of integers each of which has a probability assigned, derived from earlier experiments, e.g.:

0 = 0.5
1 = 0.2
2 = 0.3

As required of a probability distribution, these weights sum to 1.0. I am now looking for an efficient way to sample one of the values while taking the given probabilities into account, e.g. (pseudo-code):

Distribution distribution = new DiscreteDistribution(new double[]{0.5, 0.2, 0.3});
distribution.sample();

This should result in 0 half of the time according to the given numbers. However, please do not assume any patterns or regularities among the probabilities.

I've been using Apache Commons Math for my previous experiments, but it does not seem to provide a solution for this scenario, and neither does Colt.

I wonder whether this is because I've missed an easy solution. A naive implementation seems more or less straightforward, but doing this efficiently is rather involved. That is why I am looking for an established implementation.


asked Feb 29 '16 13:02

Carsten

3 Answers

Given the simplicity of the quantile function and the triviality of a manual implementation, I don't see any harm in writing this out explicitly.

Once you've drawn your random number r in [0, 1), use

if (r < 0.5 /* micro-optimisation: most likely case first */) {
    return 0;
} else if (r < 0.8 /* then the next most likely case */) {
    return 2;
} else {
    return 1;
}

Things get a little fancier with more than 3 numbers. In such cases, consider building up a table to represent the quantile function, at the expense of some degradation in performance.

(It would be difficult to beat my solution in terms of speed. In the worst case you have a couple of branches, you're helping the branch predictor in the nicest way you possibly can, and drawing the random number will be the performance bottleneck.)
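
For more than a handful of values, the table idea mentioned above could look roughly like the following. This is only a sketch under assumptions: the class name `CumulativeSampler` is made up for illustration, outcomes are taken to be the array indices 0..n-1, and all weights are assumed to be strictly positive. It stores the running sums once and then finds the bucket with a binary search, so each draw costs O(log n) instead of a chain of n branches.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper: samples index i with probability weights[i] / sum(weights). */
final class CumulativeSampler {
    private final double[] cumulative; // cumulative[i] = weights[0] + ... + weights[i]
    private final Random random;

    CumulativeSampler(double[] weights, Random random) {
        if (weights.length == 0) {
            throw new IllegalArgumentException("no weights given");
        }
        this.cumulative = new double[weights.length];
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0.0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            sum += weights[i];
            cumulative[i] = sum;
        }
        this.random = random;
    }

    int sample() {
        double total = cumulative[cumulative.length - 1];
        double r = random.nextDouble() * total; // r in [0, total)
        int pos = Arrays.binarySearch(cumulative, r);
        // Exact hit on cumulative[k]: r belongs to the next bucket, k + 1.
        // Miss: binarySearch returns -(insertionPoint) - 1.
        int index = pos >= 0 ? pos + 1 : -pos - 1;
        // Clamp in case rounding pushes r onto the final boundary.
        return Math.min(index, cumulative.length - 1);
    }
}
```

With the weights from the question this would be used as `new CumulativeSampler(new double[]{0.5, 0.2, 0.3}, new Random()).sample()`. For just three values the explicit branches are probably still faster.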

answered Oct 24 '22 06:10

Bathsheba

A very simple generic solution would be:

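import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

class Distribution<T>{
    List<Double> probs = new ArrayList<>();
    List<T> events = new ArrayList<>();
    double sumProb;
    Random rand = new Random();

    Distribution(Map<T,Double> probs){
        for(Map.Entry<T,Double> entry : probs.entrySet()){
            sumProb += entry.getValue();
            events.add(entry.getKey());
            this.probs.add(entry.getValue());
        }
    }

    public T sample(){
        // scale by sumProb so the weights need not sum to exactly 1
        double prob = rand.nextDouble()*sumProb;
        int i = 0;
        // subtract weights until prob falls inside event i; stopping at the
        // last event guards against floating-point rounding
        while(i < probs.size()-1 && prob >= probs.get(i)){
            prob -= probs.get(i);
            i++;
        }
        return events.get(i);
    }
}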

Feel free to adapt it as needed, e.g. by adding other constructors. Of course there is a lot to improve here, starting with efficiency, but it is something you can reuse a lot later.
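
On the efficiency point: one well-known way to get constant-time draws is the alias method (Walker, in Vose's formulation), which is a different technique from the linear scan above. The sketch below is not an established library implementation. The class name `AliasSampler` is an assumption, and outcomes are simplified to the indices 0..n-1 rather than a generic `T`. Setup is O(n), after which each sample needs one `nextInt` and one `nextDouble`.

```java
import java.util.ArrayDeque;
import java.util.Random;

/** Hypothetical alias-method sampler: O(n) setup, O(1) per sample. */
final class AliasSampler {
    private final double[] probability; // chance of keeping column i
    private final int[] alias;          // outcome used when column i is not kept
    private final Random random;

    AliasSampler(double[] weights, Random random) {
        int n = weights.length;
        double sum = 0.0;
        for (double w : weights) {
            if (w < 0.0) throw new IllegalArgumentException("negative weight");
            sum += w;
        }
        if (n == 0 || sum <= 0.0) {
            throw new IllegalArgumentException("need at least one positive weight");
        }
        probability = new double[n];
        alias = new int[n];
        double[] scaled = new double[n];
        ArrayDeque<Integer> small = new ArrayDeque<>();
        ArrayDeque<Integer> large = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            scaled[i] = weights[i] * n / sum; // rescale so the average is 1.0
            (scaled[i] < 1.0 ? small : large).push(i);
        }
        // Pair each under-full column with an over-full one that tops it up.
        while (!small.isEmpty() && !large.isEmpty()) {
            int s = small.pop();
            int l = large.pop();
            probability[s] = scaled[s];
            alias[s] = l;
            scaled[l] = scaled[l] + scaled[s] - 1.0;
            (scaled[l] < 1.0 ? small : large).push(l);
        }
        // Whatever is left is 1.0 up to rounding error.
        while (!large.isEmpty()) probability[large.pop()] = 1.0;
        while (!small.isEmpty()) probability[small.pop()] = 1.0;
        this.random = random;
    }

    int sample() {
        int column = random.nextInt(probability.length);
        return random.nextDouble() < probability[column] ? column : alias[column];
    }
}
```

Whether this beats a simple scan for very small n depends on the workload, so it is worth measuring before switching.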

answered Oct 24 '22 05:10

ctst

Calling Random.nextDouble() is a fairly expensive operation. You are better off using Random.nextInt(n) in this case:

int num = rand.nextInt(10); // uniform in 0..9
return num < 5 ? 0 : num < 7 ? 1 : 2; // 0 -> 5/10, 1 -> 2/10, 2 -> 3/10
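
If every probability is a whole number of n-ths, the `nextInt` approach can be turned into a precomputed lookup table instead of hard-coded comparisons. The following is a rough sketch under that assumption; `IntTableSampler` and its `counts` parameter are names invented here, with `counts[i]` meaning "outcome i gets counts[i] slots out of the total".

```java
import java.util.Random;

/** Hypothetical table sampler for probabilities given as integer counts. */
final class IntTableSampler {
    private final int[] table; // outcome i occupies counts[i] consecutive slots
    private final Random random;

    IntTableSampler(int[] counts, Random random) {
        int total = 0;
        for (int c : counts) {
            if (c < 0) throw new IllegalArgumentException("negative count");
            total += c;
        }
        if (total == 0) throw new IllegalArgumentException("all counts are zero");
        table = new int[total];
        int pos = 0;
        for (int outcome = 0; outcome < counts.length; outcome++) {
            for (int k = 0; k < counts[outcome]; k++) {
                table[pos++] = outcome;
            }
        }
        this.random = random;
    }

    int sample() {
        // One uniform integer draw, one array read.
        return table[random.nextInt(table.length)];
    }
}
```

For the numbers in the question this would be `new IntTableSampler(new int[]{5, 2, 3}, new Random())`. The table grows with the common denominator, so this only pays off when the probabilities are coarse.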

answered Oct 24 '22 07:10

Peter Lawrey
