 

How to generate correlated binary variables

I need to generate a series of N random binary variables with a given correlation function. Let x = {x_i} be a series of binary variables (each taking the value 0 or 1, with i running from 1 to N). The marginal probability is given by Pr(x_i = 1) = p, and the variables should be correlated in the following way:

Corr[x_i, x_j] = const × |i-j|^(-α)   (for i != j)

where α is a positive number.

If it is easier, consider the correlation function:

Corr[x_i, x_j] = (|i-j|+1)^(-α)

The essential part is that I want to investigate the behavior when the correlation function decays as a power law (not exponentially, as α^|i-j|).

Is it possible to generate a series like this, preferably in Python?
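To make the target concrete, here is a minimal sketch of the second correlation function written as a matrix with NumPy. The function name `power_law_corr` and the example values N = 5, α = 0.5 are just illustrative assumptions, not part of the problem statement.

```python
import numpy as np

def power_law_corr(n, alpha):
    """Target correlation matrix C[i, j] = (|i - j| + 1) ** (-alpha)."""
    idx = np.arange(n)
    lag = np.abs(idx[:, None] - idx[None, :])  # |i - j| for every pair
    return (lag + 1.0) ** (-alpha)

# Illustrative values only: N = 5, alpha = 0.5.
C = power_law_corr(5, alpha=0.5)
print(np.round(C, 3))
```

The diagonal is 1 by construction, and the off-diagonal entries fall off with the distance |i-j| as a power law.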

asked Mar 14 '10 07:03 by jonalm




2 Answers

Thanks for all your input. I found an answer to my question in the cute little article by Chul Gyu Park et al., so in case anyone runs into the same problem, look up:

"A simple method for Generating Correlated Binary Variates" (jstor.org/stable/2684925)

for a simple algorithm. The algorithm works if all the elements in the correlation matrix are positive, and it allows general marginal probabilities Pr(x_i = 1) = p_i.
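For anyone who wants to experiment, below is a rough Python sketch of a Poisson-sum construction in the spirit of that article, not a transcription of it. Each x_i is 1 exactly when a sum of shared Poisson draws is zero, and the shared rates are chosen so that the pairwise correlations come out right. The function name `correlated_binary`, the greedy way the groups are picked, and the tolerance are my own assumptions. The decomposition can fail for some combinations of p and correlation, in which case the sketch raises an error.

```python
import numpy as np

def correlated_binary(p, corr, n_samples, seed=None, tol=1e-12):
    """Draw binary vectors with Pr(x_i = 1) = p[i] and correlation matrix corr."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    corr = np.asarray(corr, dtype=float)
    if np.any(corr < 0) or np.any(p <= 0) or np.any(p >= 1):
        raise ValueError("need corr >= 0 and 0 < p < 1")
    n = len(p)
    q = 1.0 - p
    # Shared Poisson rates: alpha_ij = log(1 + corr_ij * sqrt(q_i q_j / (p_i p_j))).
    # On the diagonal (corr_ii = 1) this reduces to -log(p_i).
    alpha = np.log1p(corr * np.sqrt(np.outer(q, q) / np.outer(p, p)))
    z = np.zeros((n_samples, n))
    while alpha.max() > tol:
        # Smallest remaining positive rate and the pair (r, s) that carries it.
        masked = np.where(alpha > tol, alpha, np.inf)
        r, s = np.unravel_index(np.argmin(masked), alpha.shape)
        beta = alpha[r, s]
        if alpha[r, r] <= tol or alpha[s, s] <= tol:
            raise ValueError("decomposition failed for this p / corr combination")
        # Greedy choice (an assumption of this sketch): grow a group S from
        # {r, s} so that every rate inside S, diagonal included, is positive.
        S = [r] if r == s else [r, s]
        for k in range(n):
            if k not in S and np.all(alpha[k, S + [k]] > tol):
                S.append(k)
        # One Poisson(beta) draw per sample, shared by every member of S.
        z[:, S] += rng.poisson(beta, size=n_samples)[:, None]
        alpha[np.ix_(S, S)] -= beta
        alpha[alpha < tol] = 0.0
    # x_i = 1 exactly when the Poisson sum for variable i is zero.
    return (z == 0).astype(int)

# Illustrative check: N = 3, p = 0.5, Corr = (|i-j|+1)^(-1).
idx = np.arange(3)
target = (np.abs(idx[:, None] - idx[None, :]) + 1.0) ** -1.0
x = correlated_binary([0.5, 0.5, 0.5], target, n_samples=200_000, seed=0)
print(np.round(x.mean(axis=0), 3))
print(np.round(np.corrcoef(x.T), 3))
```

The printed sample means and sample correlations should land close to p and to the target matrix, up to sampling noise.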

j

answered Oct 02 '22 12:10 by jonalm


You're describing a random process, and it looks like a tough one to me. If you dropped the binary (0, 1) requirement and instead specified the expected value and variance, you could describe this as a white-noise generator feeding a 1-pole low-pass filter, which I think would give you the α^|i-j| characteristic.
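A minimal sketch of that non-binary case, assuming Gaussian white noise and a filter coefficient `a` between 0 and 1 that plays the role of α (the function name `ar1_noise` and the value a = 0.8 are just for illustration). Note that this produces real-valued samples with exponentially decaying correlation, so it covers neither the binary requirement nor the power law.

```python
import numpy as np

def ar1_noise(n, a, seed=None):
    """White Gaussian noise through the 1-pole filter y[t] = a*y[t-1] + b*w[t]."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)
    b = np.sqrt(1.0 - a * a)  # keeps the output variance at 1
    y = np.empty(n)
    y[0] = w[0]  # start from the stationary distribution N(0, 1)
    for t in range(1, n):
        y[t] = a * y[t - 1] + b * w[t]
    return y

# Compare the sample correlation at a few lags with a ** lag.
y = ar1_noise(200_000, a=0.8, seed=0)
for lag in (1, 2, 5):
    r = np.corrcoef(y[:-lag], y[lag:])[0, 1]
    print(lag, round(r, 3), round(0.8 ** lag, 3))
```

The two printed columns should agree up to sampling noise.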

This actually might meet the bar for mathoverflow.net, depending on how it is phrased. Let me try asking....


Update: I did ask on mathoverflow.net for the α^|i-j| case. Perhaps some of the ideas there can be adapted to your case.

answered Oct 02 '22 14:10 by Jason S