
How to numerically sample from a joint, discrete probability distribution function

I have a 2D "heat map", or PDF, that I need to recreate by random sampling. That is, I have a 2D probability density map showing starting locations, and I need to randomly choose starting locations with the same probabilities as the original PDF.

To do this, I think I need to first find the joint CDF (cumulative distribution function), then draw uniform random numbers to sample from the CDF. That's where I get stuck.

How do I numerically find the joint CDF of my PDF? I tried doing a cumulative sum along both dimensions, but that didn't yield the correct result. My knowledge of statistics is failing me.

EDIT: The heatmap/PDF is in the form [x, y, z], where z is the intensity or probability at each (x, y) point.

gallamine asked May 26 '11 21:05


1 Answer

You could first go over the 2D density map and, for each (x, y) pair in it, look up z from the PDF. This gives you a starting point (x, y) with probability z, so each starting point has its own probability from the PDF. What you can do now is put the starting points in some fixed order, pick a random number, and map it to one of the starting points.

For example, let's say you have n starting points P1 .. Pn with probabilities p1 .. pn (normalized, so that they sum to 100%). Pick a uniform random value p between 0 and 1: pick P1 if p < p1, pick P2 if p1 <= p < p1+p2, pick P3 if p1+p2 <= p < p1+p2+p3, etc. You can look at the running sums p1, p1+p2, ... as a cumulative histogram over the points P1 to Pn, which is the same thing as a cumulative distribution function. Because the points are flattened into a single ordered list, you never need a two-dimensional joint CDF.
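The procedure above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive implementation: it assumes the heat map is available as an iterable of (x, y, z) triples, as described in the question's edit, and the names `build_sampler`, `heatmap`, and `rng` are made up for this example. The z values do not have to be normalized, because the uniform draw is scaled by their total instead.

```python
import bisect
import itertools
import random


def build_sampler(heatmap, rng=random):
    """Return a function that draws (x, y) with probability proportional to z.

    `heatmap` is an iterable of (x, y, z) triples; z need not be normalized.
    `rng` is anything with a random() method returning a float in [0, 1).
    """
    # Flatten the map into an ordered list of cells, dropping zero-weight ones.
    cells = []
    weights = []
    for x, y, z in heatmap:
        if z < 0:
            raise ValueError("probabilities must be non-negative")
        if z > 0:
            cells.append((x, y))
            weights.append(z)
    if not cells:
        raise ValueError("heatmap has no cell with positive probability")

    # Running sums p1, p1+p2, ...: the (unnormalized) CDF over the ordered cells.
    cumulative = list(itertools.accumulate(weights))
    total = cumulative[-1]

    def sample():
        u = rng.random() * total
        # Index of the first running sum that exceeds u,
        # i.e. the k with p1+..+p(k-1) <= u < p1+..+pk.
        k = bisect.bisect_right(cumulative, u)
        # Guard against u rounding up to `total` in floating point.
        return cells[min(k, len(cells) - 1)]

    return sample


if __name__ == "__main__":
    # Hypothetical 2x2 heat map; the z values are made up for illustration.
    heatmap = [(0, 0, 1.0), (0, 1, 3.0), (1, 0, 0.0), (1, 1, 6.0)]
    draw = build_sampler(heatmap, random.Random(0))
    print([draw() for _ in range(5)])
```

Building the running sums once and using a binary search for each draw keeps the per-sample cost at O(log n) in the number of cells. The order in which the cells are listed does not affect the resulting distribution, only which random value maps to which cell.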

ralphtheninja answered Nov 09 '22 11:11