
How to simulate correlated binary data with R? [duplicate]

Suppose I want two vectors of binary data with a specified phi coefficient. How can I simulate them in R?

For example, how can I create two vectors like x and y below, of a specified length, with a correlation coefficient of 0.79?

> x = c(1,  1,  0,  0,  1,  0,  1,  1,  1)
> y = c(1,  1,  0,  0,  0,  0,  1,  1,  1)
> cor(x,y)
[1] 0.7905694
RNA asked Apr 18 '13 17:04



1 Answer

The bindata package is nice for generating binary data with this and more complicated correlation structures. (The package authors have a working paper, available as a PDF, that lays out the theory underlying their approach.)

In your case, assuming that the marginal probabilities of x and y are both 0.5:

library(bindata)

## Construct a binary correlation matrix
rho <- 0.7905694
m <- matrix(c(1,rho,rho,1), ncol=2)   

## Simulate 100,000 x-y pairs (1e5), and check that they have the
## specified correlation structure
x <- rmvbin(1e5, margprob = c(0.5, 0.5), bincorr = m)
cor(x)
#           [,1]      [,2]
# [1,] 1.0000000 0.7889613
# [2,] 0.7889613 1.0000000
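
For just two variables, the same thing can be sketched in base R without bindata. This is a different technique from the one the package uses: it relies on the fact that, for marginal probabilities p1 and p2, a phi coefficient rho implies P(x = 1, y = 1) = p1*p2 + rho*sqrt(p1*(1-p1)*p2*(1-p2)), and then samples pairs from the resulting 2x2 table. The function name `sim_binary_pair` and its arguments are made up for illustration, and the target is only attainable when all four cell probabilities come out non-negative.

```r
## Base-R sketch (no bindata): draw n binary pairs with marginal
## probabilities p1, p2 and target phi coefficient rho.
## `sim_binary_pair` is a hypothetical helper, not part of any package.
sim_binary_pair <- function(n, p1, p2, rho) {
  ## P(x = 1, y = 1) implied by the marginals and the phi coefficient
  p11 <- p1 * p2 + rho * sqrt(p1 * (1 - p1) * p2 * (1 - p2))

  ## Cell probabilities in the order (1,1), (1,0), (0,1), (0,0)
  cells <- c(p11, p1 - p11, p2 - p11, 1 - p1 - p2 + p11)

  ## Not every rho is attainable for given marginals
  if (any(cells < -1e-12)) stop("rho is not attainable for these marginals")
  cells <- pmax(cells, 0)  # clip tiny negative rounding errors

  ## Sample one cell index per pair, then decode it into x and y
  idx <- sample(4, n, replace = TRUE, prob = cells)
  cbind(x = as.integer(idx <= 2),           # cells 1 and 2 have x = 1
        y = as.integer(idx %in% c(1, 3)))   # cells 1 and 3 have y = 1
}

set.seed(1)
xy <- sim_binary_pair(1e5, p1 = 0.5, p2 = 0.5, rho = 0.7905694)
cor(xy)[1, 2]   # sample phi; should land close to the target
colMeans(xy)    # sample marginals; should land close to 0.5 each
```

The sample correlation only approximates the target, as with rmvbin; it gets closer as n grows. For more than two variables or more complicated structures, the package is the better tool.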
Josh O'Brien answered Nov 02 '22 23:11
