
Calculate rolling correlation using rollapply

Tags: r, zoo

I have a zoo object with 10,000+ rows.

> head(tt)
                      A             B
2007-01-04  0.005945924  0.0021167475
2007-01-05 -0.004201991 -0.0080020024
2007-01-08  0.001740897  0.0045804104
2007-01-09  0.000000000 -0.0008163931
2007-01-10 -0.004503531  0.0032615812
2007-01-11 -0.005841138  0.0043863282

I have tried variations of the following line, but to no avail.

rollapply(tt, 21, function(x) cor(x[,1],x[,2]))

Every entry gave a correlation of 1. It looks like it's picking up the 1s from the diagonal of the correlation matrix.

2013-11-25  1  1
2013-11-26  1  1
2013-11-27  1  1
2013-11-29  1  1
2013-12-02  1  1
2013-12-03  1  1

What I really want is the off-diagonal value for each window (-0.4649 over the full series), as in the following:

> cor(tt)
           A          B
A  1.0000000 -0.4649881
B -0.4649881  1.0000000
asked Dec 19 '13 at 20:12 by simon


2 Answers

For your simple case, you could use TTR::runCor.

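library(zoo)  # zoo(), rollapplyr()
library(TTR)  # runCor()

set.seed(21)
x <- rnorm(30)
y <- rnorm(30)
z <- zoo(cbind(x, y), Sys.Date() - 1:30)
# by.column=FALSE passes each 21-row window to the function with both columns
tail(rollapplyr(z, 21, function(x) cor(x[,1], x[,2]), by.column=FALSE))
# the same rolling correlation via TTR
tail(runCor(z[,1], z[,2], 21))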
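
As a quick sanity check on the two approaches above, the sketch below compares their output numerically. It assumes the zoo and TTR packages are installed; the names roll_zoo and roll_ttr are only illustrative. runCor pads the first 20 positions with NA, so those are dropped before comparing.

```r
library(zoo)
library(TTR)

set.seed(21)
x <- rnorm(30)
y <- rnorm(30)
z <- zoo(cbind(x, y), Sys.Date() - 1:30)

# right-aligned rolling correlation over 21 observations, both columns per window
roll_zoo <- rollapplyr(z, 21, function(w) cor(w[, 1], w[, 2]), by.column = FALSE)

# TTR version; the leading n - 1 values are NA
roll_ttr <- runCor(z[, 1], z[, 2], 21)

# compare the non-missing values from both methods
vals_zoo <- as.numeric(roll_zoo)
vals_ttr <- as.numeric(roll_ttr)
vals_ttr <- vals_ttr[!is.na(vals_ttr)]
all.equal(vals_zoo, vals_ttr)
```

If the two disagree on your own data, the window alignment (rollapply centers by default, rollapplyr aligns right) is the first thing to check.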
answered Sep 20 '22 at 10:09 by Joshua Ulrich


Try something like this:

library(zoo)  # rollapply()

x <- rnorm(100)
y <- rnorm(100)
rollapply(data.frame(x, y), 21, function(x) cor(x[,1], x[,2]), by.column=FALSE)

In other words, I think you may just need the by.column=FALSE argument. It works with a zoo object too:

rollapply(zoo(cbind(x,y),Sys.Date()-1:100), 21 ,function(x) cor(x[,1],x[,2]), by.column=FALSE)

Edit to address a question from the comment about adding another column.

You can specify the columns you want to use in the cor function.

z<-rnorm(100)
rollapply(zoo(cbind(x,y,z),Sys.Date()-1:100), 21 ,function(x) cor(x[,1],x[,3]), by.column=FALSE)
rollapply(zoo(cbind(x,y,z),Sys.Date()-1:100), 21 ,function(x) cor(x[,2],x[,3]), by.column=FALSE)
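
If you need several pairs at once, one option is to return the upper triangle of the window's correlation matrix from a single rollapply call. This is only a sketch: the seed, the fixed start date, and the column labels x.y, x.z, y.z are my own choices for illustration.

```r
library(zoo)

set.seed(42)
x <- rnorm(100)
y <- rnorm(100)
z <- rnorm(100)
zz <- zoo(cbind(x, y, z), as.Date("2020-01-01") + 0:99)

# for each 21-row window, compute the 3x3 correlation matrix and keep
# the upper triangle, which comes out in the order (x,y), (x,z), (y,z)
pair_cor <- rollapply(zz, 21, function(w) {
  m <- cor(w)
  m[upper.tri(m)]
}, by.column = FALSE)
colnames(pair_cor) <- c("x.y", "x.z", "y.z")

# the single-pair call from above, for comparison
xz_only <- rollapply(zz, 21, function(w) cor(w[, 1], w[, 3]), by.column = FALSE)

head(pair_cor)
```

Each column of pair_cor should match the corresponding single-pair call, at the cost of computing the full matrix in every window.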

by.column=FALSE tells rollapply to pass all columns of each window to the function together. With by.column=TRUE, which is the default, the function is applied to each column separately. That would explain the two columns of 1s in the question: each column is being correlated with itself.
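
To see the difference in the shape of the result, here is a small sketch. The data, seed, and dates are made up, and the default case uses mean rather than cor so that it makes sense column by column.

```r
library(zoo)

set.seed(1)
tt <- zoo(cbind(A = rnorm(50), B = rnorm(50)), as.Date("2020-01-01") + 0:49)

# default by.column = TRUE: the function runs on each column separately,
# so the result keeps one column per input column
per_column <- rollapply(tt, 21, mean)

# by.column = FALSE: the function sees both columns of each window,
# so a single number per window comes back
both_columns <- rollapply(tt, 21, function(w) cor(w[, 1], w[, 2]), by.column = FALSE)

dim(per_column)       # rows x columns of the per-column result
length(both_columns)  # one value per window
```

With 50 rows and a width of 21 there are 50 - 21 + 1 = 30 complete windows in both cases; only the number of columns differs.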

answered Sep 22 '22 at 10:09 by Jota