Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

the rolling regression in R using roll apply

My imported data contains 7 variables: Y and X1, X2, X3, X4, X5, X6. I tried applying the rollapply function in zoo in order to run a rolling regression within an in-sample with a window of 262 obs. (work days in a year).

         date             Y            X1            X2
1     10/1/07 -0.0080321720  4.690734e-03  3.333770e-03
2     10/2/07  0.0000000000 -2.818413e-03  5.418223e-03
3     10/3/07  0.0023158650 -4.178744e-03 -3.821100e-04
4     10/4/07 -0.0057491710 -5.071030e-03 -8.321550e-04
5     10/5/07  0.0073570500  3.065045e-03  5.179574e-03
6     10/8/07  0.0127708010 -7.278513e-03  1.145395e-03
7     10/9/07  0.0032661980  9.692267e-03  6.514035e-03
8    10/10/07  0.0013824430  1.161780e-04  2.676416e-03
9    10/11/07  0.0026607550  1.113179e-02  8.825719e-03
10   10/12/07 -0.0046362600 -2.453561e-03 -6.584070e-03
11   10/15/07 -0.0023757680 -7.829081e-03 -3.070540e-03
12   10/16/07 -0.0128673660 -4.619378e-03 -8.972126e-03
13   10/17/07  0.0016049760  1.276695e-03  5.349316e-03
14   10/18/07 -0.0044198970 -9.018499e-03 -1.215895e-02
15   10/19/07 -0.0011080330 -5.328661e-03 -7.131916e-03
16   10/22/07 -0.0024217970 -2.019539e-02 -2.021072e-02
17   10/23/07  0.0031270520  1.668604e-02  2.236130e-02
18   10/24/07 -0.0040367400 -1.061433e-02 -5.735703e-03
19   10/25/07  0.0001011170  1.346312e-02  1.036109e-02
20   10/26/07  0.0003032910  3.766526e-03  2.903628e-03
21   10/29/07  0.0004042450  1.416406e-02  2.527754e-03
22   10/30/07 -0.0012132240 -1.387166e-03 -8.202236e-03
23   10/31/07  0.0057497510  9.593904e-03  1.433401e-02
24    11/1/07 -0.0032238590 -1.648975e-02 -1.029199e-02
25    11/2/07 -0.0031330560 -7.737784e-03 -7.559498e-03
26    11/5/07 -0.0001012300 -7.877763e-03 -8.500554e-03
27    11/6/07 -0.0004050220  7.407770e-03  2.536320e-03
28    11/7/07 -0.0031444970 -5.904219e-03 -8.026064e-03
29    11/8/07 -0.0045822590 -3.712574e-03 -6.395584e-03
30    11/9/07  0.0016316540 -1.432552e-02 -1.741458e-02
31   11/12/07 -0.0019378860 -3.926583e-03 -4.543370e-03
32   11/13/07  0.0011223920 -1.952799e-03 -2.622112e-03
33   11/14/07  0.0008154940  8.687550e-06  1.085682e-03
34   11/15/07  0.0015272620 -1.549745e-02 -1.556172e-02
35   11/16/07 -0.0001017450 -5.578556e-03 -1.432244e-02
36   11/19/07  0.0014234880 -2.206707e-02 -3.537936e-02
37   11/20/07 -0.0010165700  1.643937e-02  5.140822e-03
38   11/21/07 -0.0008140010 -1.715961e-02 -2.756704e-02
39   11/22/07 -0.0008146640 -2.108098e-03  7.455698e-03
40   11/23/07  0.0008146640  1.266776e-02  1.615338e-02
41   11/26/07  0.0008140010  5.539814e-03  2.854080e-03
42   11/27/07  0.0006100660 -8.561106e-03 -9.720505e-03
43   11/28/07 -0.0015258640  3.392103e-02  2.132374e-02
44   11/29/07 -0.0006109980  6.109848e-03  1.045556e-02
45   11/30/07  0.0004073730  9.214342e-03  1.133690e-02
46    12/3/07 -0.0002036660 -7.006415e-03 -6.079820e-04
47    12/4/07  0.0002036660 -1.187605e-02 -2.554853e-02
48    12/5/07  0.0007125040  1.362121e-02  9.525618e-03
49    12/6/07 -0.0034655010  7.917348e-03  5.252105e-03
50    12/7/07  0.0018361730 -1.026832e-02  1.216898e-02
51   12/10/07  0.0013240310  3.347302e-03  1.143687e-02
52   12/11/07  0.0005087760 -3.433720e-03  2.373558e-03
53   12/12/07  0.0024385300  5.507930e-04  3.191504e-03
54   12/13/07 -0.0115336820 -1.793698e-02 -2.149447e-02
55   12/14/07 -0.0010271160 -2.307745e-03 -1.038483e-03
56   12/17/07 -0.0033969870 -1.822079e-02 -2.920662e-02
57   12/18/07  0.0000000000 -1.873297e-03 -7.061215e-03
58   12/19/07 -0.0004125410 -3.372400e-06 -7.879850e-03
59   12/20/07  0.0008249120 -6.227957e-03 -1.752460e-04
60   12/21/07 -0.0020635580  1.734991e-02  1.348190e-02
61   12/24/07  0.0003098050  0.000000e+00  0.000000e+00
62   12/25/07  0.0000000000  0.000000e+00  0.000000e+00
63   12/26/07  0.0001032470  0.000000e+00  0.000000e+00
64   12/27/07  0.0006192590  5.006783e-03  5.274480e-03
65   12/28/07 -0.0005160230  6.428153e-03  8.557260e-03
66   12/31/07  0.0000000000  0.000000e+00  0.000000e+00
67     1/1/08  0.0002064410  0.000000e+00  0.000000e+00
68     1/2/08 -0.0009293200 -6.023384e-03 -3.104400e-03
69     1/3/08  0.0027853730 -2.302511e-03 -2.759650e-03
70     1/4/08  0.0018526150 -2.149450e-02 -2.645257e-02
71     1/7/08 -0.0005142710 -4.445206e-03 -2.117698e-

1596          <NA>         <NA>          <NA>             

the last line for some reason doesn't show the values,even though there are in the original excel file(X3,X4,X5,X6 are missing,since the columns are stacked on top of each other,I copied the batch from the top for the example.

My code is:

rollapply(ts, 262, lm(
          Y~X1+X2+X3+X4+X5+X6+0, subset=1:floor(length(x)/2)), 
          align="right")

The error message I get is:

Error in eval(expr, envir, enclos) : object 'Y' not found

I really wonder why it can not find the Y variable, since it is displayed in the time series dataset with the appropriate heading.

like image 401
user3546838 Avatar asked Apr 18 '14 21:04

user3546838


People also ask

What is a rolling regression?

Rolling regressions are one of the simplest models for analysing changing relationships among variables overtime. They use linear regression but allow the data set used to change over time. In most linear regression models, parameters are assumed to be time-invariant and thus should not change overtime.

How do I run rolling regression in EViews?

Go to: Help/Quick Reference/Sample Programs & Data/ then click the roll link for detailed examples. The "Roll" Add-In is a simple EViews program that is integrated into EViews, allowing you to execute the rolling regression program from a single equation object.


2 Answers

It is not really clear what your data actually is (use dput(example_data) to give reproducible examples).

But the lm call in your example is simply doing the same regression over and over again (your x is not changing) and as josilber points out, it is supposed to be a function. Here is an example where all the data is in the data.frame allRegData and it has at least two columns, one named y and another named x:

require(zoo)
rollapply(zoo(allRegData),
          width=262,
          FUN = function(Z) 
          { 
             t = lm(formula=y~x, data = as.data.frame(Z), na.rm=T); 
             return(t$coef) 
          },
          by.column=FALSE, align="right") 
like image 167
Hans Roggeman Avatar answered Oct 12 '22 11:10

Hans Roggeman


I guess your questions are

  1. that you want to apply rolling regression on 262 width window of data for roughly 6 years yielding 1572 which is close to your 1596 observations with six covariates.
  2. figure out how to solve your problem with rollapply.
  3. an issue with loading in a data set from Excel.

It seems hard to help you with 3. since you do not provide the data set or the R code you use. Starting with 1., then you can use the rollRegres package I have made. What you would do is something like this

#####
# simulate data
width  <- 262L
n_yr   <- 6L
n_covs <- 6L
set.seed(23847996)
X <- matrix(rnorm(n_yr * width), ncol = n_covs, dimnames = list(NULL, paste0("X", 1:n_covs)))
df <- data.frame(Y = rnorm(n_yr * width), X)
head(df)
#R         Y     X1     X2     X3      X4     X5      X6
#R 1 -1.1478  0.516 -1.381  0.776 -0.0992  1.254 -0.8444
#R 2 -0.0542  1.328  1.411 -1.206  0.2560  0.975  0.9534
#R 3 -0.8350 -1.402  1.190  0.591 -1.5928  0.330  1.0806
#R 4 -0.5902  0.937 -0.182  0.193  0.1575  0.217 -0.2613
#R 5  1.7891 -0.608 -1.090  0.180 -1.1765 -0.992 -0.8831
#R 6  2.0108  0.259 -0.129  0.261  1.6694 -1.822  0.0616

#####
# estimate coefs
library(rollRegres)
fits <- roll_regres(Y ~ X1 + X2 + X3 + X4 + X5 + X6, df, width = width)
tail(fits$coefs) # estimated coefs
#R      (Intercept)     X1      X2      X3     X4      X5      X6
#R 1567    0.006511 0.0557 -0.0243 0.00907 0.0234 -0.0370 -0.0553
#R 1568    0.005816 0.0569 -0.0233 0.00798 0.0248 -0.0370 -0.0560
#R 1569    0.006406 0.0566 -0.0231 0.00814 0.0243 -0.0385 -0.0555
#R 1570    0.004898 0.0519 -0.0213 0.00773 0.0323 -0.0391 -0.0588
#R 1571    0.002922 0.0532 -0.0211 0.00809 0.0307 -0.0377 -0.0609
#R 1572    0.000771 0.0538 -0.0262 0.00580 0.0309 -0.0363 -0.0658

Now, regarding 2. then you can do something like what Hans Roggeman shows but a version that works with multiple regression as you request

library(zoo)
c2 <- rollapply(
  df, width = width, function(z){
    coef(lm(Y ~ X1 + X2 + X3 + X4 + X5 + X6, as.data.frame(z)))
  }, by.column = FALSE, fill = NA_real_, align = "right")
all.equal(fits$coefs, c2, check.attributes = FALSE) # gives the same
#R [1] TRUE

It is much slower though

microbenchmark::microbenchmark(
  rollRegrs = roll_regres(Y ~ X1 + X2 + X3 + X4 + X5 + X6, df, width = width),
  rollapply = rollapply(
    df, width = width, function(z){
      coef(lm(Y ~ X1 + X2 + X3 + X4 + X5 + X6, as.data.frame(z)))
    }, by.column = FALSE, fill = NA_real_, align = "right"), times = 25)
#R Unit: milliseconds
#R       expr     min      lq    mean  median      uq     max neval
#R  rollRegrs    1.73    1.98    2.31    2.37    2.64    3.01    25
#R  rollapply 1726.74 1798.37 1881.25 1834.00 1964.62 2178.58    25

The rollapply version can be faster if you use lm.fit but it it still slower than roll_regres.

like image 30
Benjamin Christoffersen Avatar answered Oct 12 '22 11:10

Benjamin Christoffersen