So I am running a fixed effects model using the plm
package in R, and I am wondering how I can compare which of two models are more suitable.
For example, here is the code for two models I have constructed:
library(plm)
eurofix <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+fx+ld+euro+core,
data=euro,
model="within")
eurofix2 <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+ld+euro+core,
data=euro,
model="within")
I know that with a regular lm
call, I can compare two models by running an anova test, but that does not seem to work in this case. I always get the following error:
Error in UseMethod("anova") :
no applicable method for 'anova' applied to an object of class "c('plm', 'panelmodel')"
Does anybody know what to do with the plm
package? Is the Wald Test appropriate?
To compare the fits of two models, you can use the anova() function with the regression objects as two separate arguments. The anova() function will take the model objects as arguments, and return an ANOVA testing whether the more complex model is significantly better at capturing the data than the simpler model.
plm is a package for R which intends to make the estimation of linear panel models straightforward. plm provides functions to estimate a wide variety of models and to make (robust) inference. Details For a gentle and comprehensive introduction to the package, please see the package's vignette.
PLM Objects are rule sets, checks, and rules that can be created independently from a model and can be loaded and applied to any model.
The following code answered a similar question in Cross Validated The question there is also about test (joint) hypothesis in plm
routine. It should be straightforward to apply the codes to your question.
library(plm) # Use plm
library(car) # Use F-test in command linearHypothesis
library(tidyverse)
data(egsingle, package = 'mlmRev')
dta <- egsingle %>% mutate(Female = recode(female, .default = 0L, `Female` = 1L))
plm1 <- plm(math ~ Female * (year), data = dta, index = c('childid', 'year', 'schoolid'), model = 'within')
# Output from `summary(plm1)` --- I deleted a few lines to save space.
# Coefficients:
# Estimate Std. Error t-value Pr(>|t|)
# year-1.5 0.8842 0.1008 8.77 <2e-16 ***
# year-0.5 1.8821 0.1007 18.70 <2e-16 ***
# year0.5 2.5626 0.1011 25.36 <2e-16 ***
# year1.5 3.1680 0.1016 31.18 <2e-16 ***
# year2.5 3.9841 0.1022 38.98 <2e-16 ***
# Female:year-1.5 -0.0918 0.1248 -0.74 0.46
# Female:year-0.5 -0.0773 0.1246 -0.62 0.53
# Female:year0.5 -0.0517 0.1255 -0.41 0.68
# Female:year1.5 -0.1265 0.1265 -1.00 0.32
# Female:year2.5 -0.1465 0.1275 -1.15 0.25
# ---
xnames <- names(coef(plm1)) # a vector of all independent variables' names in 'plm1'
# Use 'grepl' to construct a vector of logic value that is TRUE if the variable
# name starts with 'Female:' at the beginning. This is generic, to pick up
# every variable that starts with 'year' at the beginning, just write
# 'grepl('^year+', xnames)'.
picked <- grepl('^Female:+', xnames)
linearHypothesis(plm1, xnames[picked])
# Hypothesis:
# Female:year - 1.5 = 0
# Female:year - 0.5 = 0
# Female:year0.5 = 0
# Female:year1.5 = 0
# Female:year2.5 = 0
#
# Model 1: restricted model
# Model 2: math ~ Female * (year)
#
# Res.Df Df Chisq Pr(>Chisq)
# 1 5504
# 2 5499 5 6.15 0.29
I struggled with this too, but finally came up with the following solution (with the help of a PhD friend). Using your example, see a sample solution below.
Use the AIC Criteria to compare panel models as follows:
library(plm)
eurofix <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+fx+ld+euro+core,
data=euro,
model="within")
eurofix2 <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+ld+euro+core,
data=euro,
model="within")
# AIC = log(RSS/N) + 2K/N for linear models
# AIC = log(RSS/n) + 2K/n for panel models
Sum1 <- summary(eurofix)
RSS1 <- sum(Sum1$residuals^2)
K1 <- max(eurofix$assign)
N1 <- length(eurofix$residuals)
n1 <- N1 - K1 - eurofix$df.residual
AIC_eurofix = log(RSS1/n1) + (2*K1)/n1
Sum2 <- summary(eurofix2)
RSS2 <- sum(Sum2$residuals^2)
K2 <- max(eurofix2$assign)
N2 <- length(eurofix2$residuals)
n2 <- N2 - K2 - eurofix2$df.residual
AIC_eurofix2 = log(RSS2/n2) + (2*K2)/n2
The lower AIC value is the preferred model!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With