My question is two-fold.
How do I select an appropriate lag for my regression equation? I've got a dependent variable of house price, and independent variables of rent, house supply, national stock market index, mortgage rate, and house vacancy rate.
I did some reading and found that VARselect(data, lag.max = 1 or 2 or 3 etc.) can help me select an appropriate lag. data is a csv file with the above variables. So the below is what I got. How am I supposed to interpret it?
> var=VARselect(data,lag.max=8)
> var
$selection
AIC(n)  HQ(n)  SC(n) FPE(n)
     3      3      1      3

$criteria
              1        2        3        4        5        6        7        8
AIC(n) 1.716881 1.575052 1.474927 1.543878 1.493210 1.651975 1.624066 1.773173
HQ(n)  1.807505 1.726093 1.686385 1.815752 1.825500 2.044682 2.077189 2.286712
SC(n)  1.962629 1.984634 2.048341 2.281125 2.394289 2.716887 2.852810 3.165750
FPE(n) 5.569664 4.841214 4.396341 4.741887 4.556023 5.424803 5.393498 6.451249
I guess, long story short, what I want to find out is: how much should I lag each of rent, house supply, national stock market index, mortgage rate, and house vacancy rate against house price to create a 'good enough' model?
I am open to other methods that help me find out what I should do but please help me out with the code. Thanks.
Using VAR, the optimal lag is the one with the minimum value of the chosen criterion, that is AIC, SIC (reported as SC), HQ or FPE. If the criteria select different lags, which will most likely happen, you're at liberty to use any of them.
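As a quick check of the "minimum value" rule, the sketch below re-enters the $criteria numbers posted in the question and finds the minimising lag for each row. It uses base R only; the names crit and best_lag are made up for illustration.

```r
# Criteria values copied from the VARselect output posted in the question
# (rows = criteria, columns = lags 1 to 8).
crit <- rbind(
  "AIC(n)" = c(1.716881, 1.575052, 1.474927, 1.543878, 1.493210, 1.651975, 1.624066, 1.773173),
  "HQ(n)"  = c(1.807505, 1.726093, 1.686385, 1.815752, 1.825500, 2.044682, 2.077189, 2.286712),
  "SC(n)"  = c(1.962629, 1.984634, 2.048341, 2.281125, 2.394289, 2.716887, 2.852810, 3.165750),
  "FPE(n)" = c(5.569664, 4.841214, 4.396341, 4.741887, 4.556023, 5.424803, 5.393498, 6.451249)
)
colnames(crit) <- 1:8

# For each criterion (row), find the lag (column) with the smallest value.
best_lag <- apply(crit, 1, which.min)
print(best_lag)  # lags 3, 3, 1, 3, matching $selection in the question
```

Here AIC, HQ and FPE agree on 3 lags while SC, which penalises extra parameters more heavily, picks 1.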
This lag length is frequently selected using an explicit statistical criterion such as the AIC or SIC. Symmetric lag VAR models are easily estimated; since the specification of all equations of the model is the same, estimation by ordinary least squares yields efficient parameter estimates.
The six criteria are the Schwarz Information Criterion (SIC), the Hannan-Quinn Criterion (HQC), the Akaike Information Criterion (AIC), the general-to-specific sequential Likelihood Ratio test (LR), a small-sample correction to that test (SLR) proposed by Sims (1980), and the specific-to-general sequential Portmanteau ...
Check out the documentation for the vars package, in particular for the VARselect function (same information as ?VARselect, but formatted nicely).
What the $selection object is telling you is the lag order selected by minimizing each of the 4 criteria (Akaike, Hannan-Quinn, Schwarz, and Final Prediction Error).
What the $criteria object tells you is the value of each criterion at the given lag (so that $criteria[3L, p], for example, tells you what the Schwarz criterion was for the pth lag specification). This may be useful if there are a lot of lags that have similar criterion values, allowing you to choose a more parsimonious specification if the minimizer has p very high, but a much lower value of p gives you a similar criterion.
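One way to make the "similar criterion at a lower p" idea concrete is sketched below, using the AIC row from the question. The helper parsimonious_lag and the tolerance of 0.11 are hypothetical choices for illustration, not part of the vars package; how close counts as "similar" is a judgment call.

```r
# AIC row copied from the $criteria output in the question (lags 1 to 8).
aic <- c(1.716881, 1.575052, 1.474927, 1.543878, 1.493210, 1.651975, 1.624066, 1.773173)

# Hypothetical helper: the smallest lag whose criterion value lies within
# `tol` of the minimum. The tolerance is an arbitrary assumption.
parsimonious_lag <- function(values, tol) {
  min(which(values <= min(values) + tol))
}

which.min(aic)               # the plain minimiser: lag 3
parsimonious_lag(aic, 0.11)  # lag 2 is within 0.11 of the minimum AIC
```

With a tolerance of 0 the helper simply returns the minimiser, so it only changes the answer when a shorter lag is nearly as good.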
Please also note that if you just run VARselect(data), it will evaluate the criterion for fitting the model jointly. I'm not sure what you're going for, but from your question it seems like you might have wanted to evaluate the lag selection process for each of the columns in your data separately. To do so you'd have to run lapply(data, VARselect).
I believe the AIC and SC criteria are the most often used in practice, and AIC in particular is well documented (see Helmut Lütkepohl, New Introduction to Multiple Time Series Analysis).
The right answer is that there is no one method that is known to give the best result; presumably that's why they are all still in the vars package.
One way to get a good idea for your own model would be to run the selection above for all variables or for specific subsets and then see which of the four criteria gives consistent values. Then weigh this against the frequency of your data (daily, weekly, monthly, yearly?) and make an educated decision. If you have monthly data, it is likely that the factors mentioned above have effects up to 6 months later, e.g. house supply against house prices, as houses aren't built or vacated very quickly.
In case you aren't sure where the lag information criterion comes into the VAR model: the function VAR from package 'vars' has an ic argument, used together with lag.max, where you can specify "AIC", "HQ", "SC" or "FPE".
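A minimal sketch of letting VAR choose the lag order by a named criterion follows. It assumes the vars package is installed and uses the Canada example data shipped with it as a stand-in for the asker's data; lag.max = 8 and type = "const" are illustrative choices only.

```r
library(vars)

# Canada is an example data set from the vars package; it stands in for
# the asker's house price data here (an assumption for illustration).
data(Canada)

# Let VAR() pick the lag order: it evaluates lags 1 to lag.max and keeps
# the one that minimises the criterion named in `ic`.
fit <- VAR(Canada, lag.max = 8, ic = "SC", type = "const")
fit$p  # the lag order that was actually used

# The same choice, read off VARselect() directly.
sel <- VARselect(Canada, lag.max = 8, type = "const")$selection
sel["SC(n)"]
```

Swapping ic = "SC" for "AIC", "HQ" or "FPE" selects by the corresponding criterion instead.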