Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Discrete Probability Distribution in Java

I have a set of integers each of which has a probability assigned, derived from earlier experiments, e.g.:

0 = 0.5
1 = 0.2
2 = 0.3

Complying with the specifications of a probability distribution, these weights sum up to 1.0. I am now looking for an efficient way to sample one of the values while taking the given probabilities into account, e.g. (pseude-code):

Distribution distribution = new DiscreteDistribution(new double[]{0.5, 0.3, 0.2});
distribution.sample();

This should result in 0 half of the time according to the given numbers. However, do not assume any patterns or regularities among these.

I've been using Apache Commons Math for my previous experiments, but it does not seem to provide a solution for this scenario, neither does Colt.

I wonder whether this is because I've missed an easy solution. A naive implemententation seems more or less straight-forward, but doing this efficiently is rather involved. That is why I am looking for an established implementation.

like image 784
Carsten Avatar asked Feb 29 '16 13:02

Carsten


People also ask

What is discrete probability distribution example?

A discrete probability distribution counts occurrences that have countable or finite outcomes. This is in contrast to a continuous distribution, where outcomes can fall anywhere on a continuum. Common examples of discrete distribution include the binomial, Poisson, and Bernoulli distributions.

How do you calculate probability in Java?

To simulate probability in Java, the first thing we need to do is to generate random numbers. Fortunately, Java provides us with plenty of random numbers generators. Here, we drew numbers from 1 to 100. The chance for our random number to be lesser or equal to 50 is exactly 50%.

What is discrete data distribution?

A discrete distribution is a distribution of data in statistics that has discrete values. Discrete values are countable, finite, non-negative integers, such as 1, 10, 15, etc.


3 Answers

Given the simplicity of the quantile function and the triviality of a manual implementation, I don't see any harm in writing this out explicitly.

Once you've drawn your random number r in [0, 1), use

if (r <= 0.5/*micro-optimisation: most likely case first*/){
    return 0;
} else if (r <= 0.8/*then the next most likely case*/){
    return 2;
} else {
    return 1;
}

Perhaps things get a little more fancy for more than 3 numbers, consider building up a table to represent the quantile function in such cases, at the expense of some degradation in performance.

(It would be difficult to beat my solution in terms of speed, in the worst case you have a couple of branches - and you're helping a branch predictor in the nicest way you possibly can, and the random number drawing will be where the performance bottleneck is).

like image 62
Bathsheba Avatar answered Oct 24 '22 06:10

Bathsheba


A very simple generic solution would be:

class Distribution<T>{
    List<Double> probs = new ArrayList<>();
    List<T> events = new ArrayList<>();
    double sumProb;
    Random rand = new Random();

    Distribution(Map<T,Double> probs){
        for(T event : probs.keySet()){
            sumProb += probs.get(event);
            events.add(event);
            this.probs.add(probs.get(event));
        }
    }

    public T sample(){
        T value;
        double prob = rand.nextDouble()*sumProb;
        int i;
        for(i=0; prob>0; i++){
            prob-= probs.get(i);
        }
        return events.get(i-1);
    }
}

Feel free to change it, as you need it, e.g. with adding other constructors. Of course here is a lot of stuff to improve, starting with the efficiency, but it is something you can reuse later a lot.

like image 39
ctst Avatar answered Oct 24 '22 05:10

ctst


Calling Random.nextDouble() is a fairly expensive operation. You are better off using Random.nextInt(n) in this case

int num = rand.nextInt(10);
return num <= 5 ? 0 : num <= 8 ? 1 : 2;
like image 45
Peter Lawrey Avatar answered Oct 24 '22 07:10

Peter Lawrey