Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to compare 2 models in R using the plm package?

Tags:

r

anova

plm

So I am running a fixed effects model using the plm package in R, and I am wondering how I can compare which of two models are more suitable.

For example, here is the code for two models I have constructed:

library(plm)

eurofix <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+fx+ld+euro+core, 
               data=euro, 
               model="within")

eurofix2 <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+ld+euro+core, 
                data=euro,
                model="within")

I know that with a regular lm call, I can compare two models by running an anova test, but that does not seem to work in this case. I always get the following error:

Error in UseMethod("anova") : 
  no applicable method for 'anova' applied to an object of class "c('plm', 'panelmodel')"

Does anybody know what to do with the plm package? Is the Wald Test appropriate?

like image 991
NuclearPenguins Avatar asked Feb 05 '15 00:02

NuclearPenguins


People also ask

How do I compare two models in R?

To compare the fits of two models, you can use the anova() function with the regression objects as two separate arguments. The anova() function will take the model objects as arguments, and return an ANOVA testing whether the more complex model is significantly better at capturing the data than the simpler model.

What is PLM package in R?

plm is a package for R which intends to make the estimation of linear panel models straightforward. plm provides functions to estimate a wide variety of models and to make (robust) inference. Details For a gentle and comprehensive introduction to the package, please see the package's vignette.

What is a PLM object?

PLM Objects are rule sets, checks, and rules that can be created independently from a model and can be loaded and applied to any model.


2 Answers

The following code answered a similar question in Cross Validated The question there is also about test (joint) hypothesis in plm routine. It should be straightforward to apply the codes to your question.

library(plm)  # Use plm
library(car)  # Use F-test in command linearHypothesis
library(tidyverse)
data(egsingle, package = 'mlmRev')
dta <- egsingle %>% mutate(Female = recode(female, .default = 0L, `Female` = 1L))
plm1 <- plm(math ~ Female * (year), data = dta, index = c('childid', 'year', 'schoolid'), model = 'within')

# Output from `summary(plm1)` --- I deleted a few lines to save space.
# Coefficients:
#                 Estimate Std. Error t-value Pr(>|t|)    
# year-1.5          0.8842     0.1008    8.77   <2e-16 ***
# year-0.5          1.8821     0.1007   18.70   <2e-16 ***
# year0.5           2.5626     0.1011   25.36   <2e-16 ***
# year1.5           3.1680     0.1016   31.18   <2e-16 ***
# year2.5           3.9841     0.1022   38.98   <2e-16 ***
# Female:year-1.5  -0.0918     0.1248   -0.74     0.46    
# Female:year-0.5  -0.0773     0.1246   -0.62     0.53    
# Female:year0.5   -0.0517     0.1255   -0.41     0.68    
# Female:year1.5   -0.1265     0.1265   -1.00     0.32    
# Female:year2.5   -0.1465     0.1275   -1.15     0.25    
# ---

xnames <- names(coef(plm1)) # a vector of all independent variables' names in 'plm1'
# Use 'grepl' to construct a vector of logic value that is TRUE if the variable
# name starts with 'Female:' at the beginning. This is generic, to pick up
# every variable that starts with 'year' at the beginning, just write
# 'grepl('^year+', xnames)'.
picked <- grepl('^Female:+', xnames)
linearHypothesis(plm1, xnames[picked])

# Hypothesis:
# Female:year - 1.5 = 0
# Female:year - 0.5 = 0
# Female:year0.5 = 0
# Female:year1.5 = 0
# Female:year2.5 = 0
# 
# Model 1: restricted model
# Model 2: math ~ Female * (year)
# 
#   Res.Df Df Chisq Pr(>Chisq)
# 1   5504                    
# 2   5499  5  6.15       0.29
like image 151
semibruin Avatar answered Nov 15 '22 00:11

semibruin


I struggled with this too, but finally came up with the following solution (with the help of a PhD friend). Using your example, see a sample solution below.

Use the AIC Criteria to compare panel models as follows:

library(plm)

eurofix <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+fx+ld+euro+core, 
               data=euro, 
               model="within")

eurofix2 <- plm(rlogmod ~ db+gdp+logvix+gb+i+logtdo+ld+euro+core, 
                data=euro,
                model="within")

# AIC = log(RSS/N) + 2K/N  for linear models
# AIC = log(RSS/n) + 2K/n  for panel models

Sum1 <- summary(eurofix)
RSS1 <- sum(Sum1$residuals^2)
K1 <- max(eurofix$assign)
N1 <- length(eurofix$residuals)
n1 <- N1 - K1 - eurofix$df.residual

AIC_eurofix = log(RSS1/n1) + (2*K1)/n1

Sum2 <- summary(eurofix2)
RSS2 <- sum(Sum2$residuals^2)
K2 <- max(eurofix2$assign)
N2 <- length(eurofix2$residuals)
n2 <- N2 - K2 - eurofix2$df.residual

AIC_eurofix2 = log(RSS2/n2) + (2*K2)/n2

The lower AIC value is the preferred model!

like image 34
Glen Wade Avatar answered Nov 14 '22 23:11

Glen Wade