How to generate a sequence of numbers, which would have a specific correlation (for example 0.56) and would consist of.. say 50 numbers with R program? Ty.
We can use mvrnorm()
from package MASS
require(MASS)
out <- mvrnorm(50, mu = c(0,0), Sigma = matrix(c(1,0.56,0.56,1), ncol = 2),
empirical = TRUE)
which gives
> cor(out)
[,1] [,2]
[1,] 1.00 0.56
[2,] 0.56 1.00
The empirical = TRUE
bit is important otherwise the actual correlation achieved is subject to randomness too and will not be exactly the stated value with larger discrepancies for smaller samples.
For this one you can use the arima.sim()
function:
> arima.sim(list(ar = 0.56), n = 50)
Time Series:
Start = 1
End = 50
Frequency = 1
[1] 0.62125233 -0.04742303 0.57468608 -0.07201988 -1.91416757 -1.11827563
[7] 0.15718249 0.63217365 -1.24635896 -0.22950855 -0.79918784 0.31892842
[13] 0.33335688 -1.24328177 -0.79056890 1.08443057 0.55553819 0.33460674
[19] -0.33037659 -0.65244221 0.70461755 0.61450122 0.53731454 0.19563672
[25] 1.73945110 1.27119241 0.82484460 1.58382861 1.81619212 -0.94462052
[31] -1.36024898 -0.30964390 -0.94963216 -3.75725819 -1.77342095 -1.20963799
[37] -1.76325350 -1.20556172 -0.94684678 -0.85407649 0.14922226 -0.31109945
[43] 0.39456259 0.89610859 -0.70913792 -2.27954408 -1.14722464 0.39140446
[49] 0.66376227 1.63275483
If you don't want to specify those matrices, other options are corgen
from ecodist
:
library("ecodist")
xy <- corgen(len = 50, r = 0.56, epsilon = 0.01)
Or rolling your own:
simcor <- function (n, xmean, xsd, ymean, ysd, correlation) {
x <- rnorm(n)
y <- rnorm(n)
z <- correlation * scale(x)[,1] + sqrt(1 - correlation^2) *
scale(resid(lm(y ~ x)))[,1]
xresult <- xmean + xsd * scale(x)[,1]
yresult <- ymean + ysd * z
data.frame(x=xresult,y=yresult)
}
Test
> r <- simcor(n = 50, xmean = 12, ymean = 30, xsd = 3, ysd = 4, correlation = 0.56)
> cor(r$x,r$y)
[1] 0.56
Use rmvnorm
from the mvtnorm
package to sample from the multivariate normal distribution. For example for correlation of 0.56:
library("mvtnorm")
foo <- rmvnorm(10000,c(0,0),matrix(c(1,0.56,0.56,1),2,2))
Test:
> cor(foo[,1],foo[,2])
[1] 0.5611207
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With