
Selecting an appropriate lag for a regression equation and how to interpret the results of VARselect

Tags: var, r, vector

My question is two-fold.

How do I select an appropriate lag for my regression equation? I've got a dependent variable of house price, and independent variables of rent, house supply, national stock market index, mortgage rate, and house vacancy rate.

I did some reading and found that VARselect(data,lag.max=1 or 2 or 3 etc) can help me select an appropriate lag.

data is a csv file with the above variables. Below is what I got. How am I supposed to interpret it?

> var=VARselect(data,lag.max=8)
> var
$selection
AIC(n)  HQ(n)  SC(n) FPE(n) 
     3      3      1      3 

$criteria
              1        2        3        4        5        6        7        8
AIC(n) 1.716881 1.575052 1.474927 1.543878 1.493210 1.651975 1.624066 1.773173
HQ(n)  1.807505 1.726093 1.686385 1.815752 1.825500 2.044682 2.077189 2.286712
SC(n)  1.962629 1.984634 2.048341 2.281125 2.394289 2.716887 2.852810 3.165750
FPE(n) 5.569664 4.841214 4.396341 4.741887 4.556023 5.424803 5.393498 6.451249

I guess, long story short, what I want to find out is how much I should lag each of rent, house supply, national stock market index, mortgage rate, and house vacancy rate against house price to create a 'good enough' model.

I am open to other methods that help me find out what I should do but please help me out with the code. Thanks.

user1944379 asked Jan 03 '13 01:01



2 Answers

Check out the documentation for the vars package, in particular for the VARselect function (same information as ?VARselect, but formatted nicely).

What the $selection object is telling you is the lag order selected by minimizing each of the 4 criteria (Akaike, Hannan-Quinn, Schwarz, and Final Prediction Error).

What the $criteria object tells you is the value of each criterion at the given lag (so that $criteria[3L, p], for example, tells you what the Schwarz criterion was for the pth lag specification). This may be useful if a lot of lags have similar criterion values: if the minimizer has a very high p but a much lower p gives a similar criterion value, you can choose the more parsimonious specification.
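To make the link between the two objects concrete, here is a small sketch in base R that rebuilds the $criteria matrix from the numbers posted in the question and recovers $selection by taking the minimizing lag in each row. The object names crit and sel are just illustrative.

```r
# Criteria values copied from the VARselect output in the question
crit <- matrix(
  c(1.716881, 1.575052, 1.474927, 1.543878, 1.493210, 1.651975, 1.624066, 1.773173,
    1.807505, 1.726093, 1.686385, 1.815752, 1.825500, 2.044682, 2.077189, 2.286712,
    1.962629, 1.984634, 2.048341, 2.281125, 2.394289, 2.716887, 2.852810, 3.165750,
    5.569664, 4.841214, 4.396341, 4.741887, 4.556023, 5.424803, 5.393498, 6.451249),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("AIC(n)", "HQ(n)", "SC(n)", "FPE(n)"), as.character(1:8))
)

# For each criterion (row), the selected lag is the column with the smallest value
sel <- apply(crit, 1, which.min)
print(sel)

# How much worse is each lag than the best one, per criterion?
# Small gaps suggest a shorter lag may be nearly as good.
print(round(crit - apply(crit, 1, min), 3))
```

This reproduces the 3, 3, 1, 3 shown under $selection. The Schwarz criterion penalizes extra parameters more heavily than the others, which is why it settles on a shorter lag here.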

Please also note that if you just run VARselect(data), it will evaluate the criteria for fitting the model jointly. I'm not sure what you're going for, but from your question it seems like you might have wanted to evaluate the lag selection process for each of the columns in your data separately. To do so you'd have to run lapply(data, VARselect).
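As a rough sketch of the difference between the joint and the column-by-column call, the snippet below uses simulated data, since the original csv isn't available. The data frame sim and its column names are made-up stand-ins, not the asker's data, and it assumes the vars package is installed.

```r
library(vars)

set.seed(1)
n <- 200
# Hypothetical stand-in data: three simulated stationary series
sim <- data.frame(
  price = as.numeric(arima.sim(list(ar = 0.5), n = n)),
  rent  = as.numeric(arima.sim(list(ar = 0.3), n = n)),
  rate  = as.numeric(arima.sim(list(ar = 0.7), n = n))
)

# Joint: one lag order chosen for the whole system of variables
joint <- VARselect(sim, lag.max = 8)
print(joint$selection)

# Separate: one univariate lag selection per column
per_series <- lapply(sim, VARselect, lag.max = 8)
sel_mat <- sapply(per_series, function(x) x$selection)
print(sel_mat)  # rows are criteria, columns are variables
```

The joint call gives a single lag order per criterion, while the lapply version gives one per variable, each judged only on that variable's own history.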

like image 173
MichaelChirico answered Nov 04 '22 05:11


I believe the AIC and SC criteria are the most often used in practice and AIC in particular is well documented (see: Helmut Lütkepohl, New Introduction to Multiple Time Series Analysis).

The right answer is that there is no one method that is known to give the best result. That's why they are all still in the vars package, presumably.

One way to get a good idea for your own model would be to carry out the test above for all variables/specific subsets and then see which of the four criteria gives consistent values. Then take this into account together with the frequency of your data (daily, weekly, monthly, yearly?) and make an educated decision. If you have monthly data, then it is likely that the factors mentioned above do have effects up to 6 months later, e.g. house supply against house prices, as houses aren't built/vacated very quickly.

In case you aren't sure where the lag information criterion comes into the VAR model: the function VAR from package 'vars' has an ic argument, where you can pass "AIC", "HQ", "SC" or "FPE". It is used together with lag.max to pick the lag order.
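A minimal sketch of that, again on simulated data because the original csv isn't available (sim and its column names are hypothetical, and the vars package is assumed to be installed):

```r
library(vars)

set.seed(1)
n <- 200
# Hypothetical stand-in data, not the asker's series
sim <- data.frame(
  price = as.numeric(arima.sim(list(ar = 0.5), n = n)),
  rent  = as.numeric(arima.sim(list(ar = 0.3), n = n)),
  rate  = as.numeric(arima.sim(list(ar = 0.7), n = n))
)

# Let VAR pick the lag order itself: it considers lags up to lag.max
# and keeps the one that minimizes the criterion named in ic
fit <- VAR(sim, lag.max = 8, ic = "AIC")

print(fit$p)            # lag order that was chosen
print(coef(fit)$price)  # coefficients of the equation for price
```

Swapping ic = "AIC" for ic = "SC" makes it easy to see how much the fitted model changes between the criteria.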

n1k31t4 answered Nov 04 '22 04:11