Can I specify a Random and a Fixed Effects model on Panel Data using lme4?
I am redoing Example 14.4 from Wooldridge (2013, pp. 494-5) in R. Thanks to this site and this blog post I've managed to do it with the plm package, but I'm curious whether I can do the same with the lme4 package.
Here's what I've done with the plm package; I'd be grateful for any pointers on how to do the same using lme4. First, the packages needed and loading of the data:
# install.packages(c("wooldridge", "plm", "stargazer"), dependencies = TRUE)
library(wooldridge)
data(wagepan)
Second, I estimate the three models from Example 14.4 (Wooldridge, 2013) using the plm package:
library(plm)
Pooled.ols <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + union +
                    factor(year), data = wagepan, index = c("nr", "year"), model = "pooling")
random.effects <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + union +
                        factor(year), data = wagepan, index = c("nr", "year"), model = "random")
fixed.effects <- plm(lwage ~ I(exper^2) + married + union + factor(year),
                     data = wagepan, index = c("nr", "year"), model = "within")
Third, I output the results using stargazer to emulate Table 14.2 in Wooldridge (2013):
stargazer::stargazer(Pooled.ols,random.effects,fixed.effects, type="text",
column.labels=c("OLS (pooled)","Random Effects","Fixed Effects"),
dep.var.labels = c("log(wage)"), keep.stat=c("n"),
keep=c("edu","bla","his","exp","marr","union"), align = TRUE, digits = 4)
#> ======================================================
#> Dependent variable:
#> -----------------------------------------
#> log(wage)
#> OLS (pooled) Random Effects Fixed Effects
#> (1) (2) (3)
#> ------------------------------------------------------
#> educ 0.0913*** 0.0919***
#> (0.0052) (0.0107)
#>
#> black -0.1392*** -0.1394***
#> (0.0236) (0.0477)
#>
#> hisp 0.0160 0.0217
#> (0.0208) (0.0426)
#>
#> exper 0.0672*** 0.1058***
#> (0.0137) (0.0154)
#>
#> I(exper2) -0.0024*** -0.0047*** -0.0052***
#> (0.0008) (0.0007) (0.0007)
#>
#> married 0.1083*** 0.0640*** 0.0467**
#> (0.0157) (0.0168) (0.0183)
#>
#> union 0.1825*** 0.1061*** 0.0800***
#> (0.0172) (0.0179) (0.0193)
#>
#> ------------------------------------------------------
#> Observations 4,360 4,360 4,360
#> ======================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
Is there an equally simple way to do this in lme4? Should I stick with plm? Why or why not?
Panel data models examine cross-sectional (group) and/or time-series (time) effects. These effects may be fixed and/or random. Fixed effects assume that each group/time has its own intercept in the regression equation, while random effects assume that each group/time has its own disturbance.
The most important practical difference between the two is this: Random effects are estimated with partial pooling, while fixed effects are not. Partial pooling means that, if you have few data points in a group, the group's effect estimate will be based partially on the more abundant data from other groups.
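As a minimal sketch of that difference (my illustration, not from the original post), compare a no-pooling dummy-variable fit with a partially pooled lmer fit on the wagepan data; the lmer intercepts are shrunk towards the grand mean:

# Sketch: no pooling vs. partial pooling of the individual (nr) intercepts
library(wooldridge)
library(lme4)
data(wagepan)
wagepan$nr <- factor(wagepan$nr)

# No pooling: one dummy per individual, each intercept estimated only
# from that individual's own observations
no_pool <- lm(lwage ~ union + nr, data = wagepan)

# Partial pooling: individual intercepts are random effects, shrunk
# towards the grand mean
part_pool <- lmer(lwage ~ union + (1 | nr), data = wagepan)

fixef(part_pool)            # population-level intercept and union effect
head(ranef(part_pool)$nr)   # shrunken deviation of each nr from the grand mean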
Except for the difference in estimation method, it does indeed seem to be mainly a question of vocabulary and syntax.
# install.packages(c("wooldridge", "plm", "stargazer", "lme4"), dependencies = TRUE)
library(wooldridge)
library(plm)
#> Loading required package: Formula
library(lme4)
#> Loading required package: Matrix
data(wagepan)
Your first example is a simple linear model ignoring the groups nr. You can't do that with lme4, because there is no "random effect" (in the lme4 sense). This is what Gelman & Hill call a complete pooling approach.
Pooled.ols <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married +
                    union + factor(year), data = wagepan,
                  index = c("nr", "year"), model = "pooling")
Pooled.ols.lm <- lm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + union +
                      factor(year), data = wagepan)
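As a quick check (my addition, not part of the original code), the two pooled fits are plain OLS on the same formula, so their coefficient estimates should agree:

# Both fits ignore the panel structure and use the same design matrix,
# so the point estimates should be (numerically) identical
all.equal(coef(Pooled.ols), coef(Pooled.ols.lm))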
Your second example seems to be equivalent to a random-intercept mixed model with nr as a random effect (but the slopes of all predictors are fixed). This is what Gelman & Hill call a partial pooling approach.
random.effects <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married +
union + factor(year), data = wagepan,
index = c("nr","year") , model = "random")
random.effects.lme4 <- lmer(lwage ~ educ + black + hisp + exper + I(exper^2) + married +
union + factor(year) + (1|nr), data = wagepan)
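To inspect what the mixed model estimated, the standard lme4 accessors can be used (a brief sketch):

fixef(random.effects.lme4)           # population-level ("fixed") coefficients
head(ranef(random.effects.lme4)$nr)  # predicted random intercept for each individual
VarCorr(random.effects.lme4)         # between-individual vs. residual variance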
Your third example seems to correspond to a case where nr is a fixed effect and you compute a different intercept for each nr group. Again: you can't do that with lme4, because there is no "random effect" (in the lme4 sense). This is what Gelman & Hill call a "no pooling" approach.
fixed.effects <- plm(lwage ~ I(exper^2) + married + union + factor(year),
data = wagepan, index = c("nr","year"), model="within")
wagepan$nr <- factor(wagepan$nr)
fixed.effects.lm <- lm(lwage ~ I(exper^2) + married + union + factor(year) + nr,
data = wagepan)
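The time-varying coefficients of the within estimator and of the dummy-variable (least-squares dummy variable, LSDV) regression should coincide; a quick side-by-side comparison (my sketch, not from the original answer):

# Coefficients shared by the two "fixed effects" fits
shared <- names(coef(fixed.effects))
cbind(plm_within = coef(fixed.effects),
      lm_dummies = coef(fixed.effects.lm)[shared])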
Compare the results:
stargazer::stargazer(Pooled.ols, Pooled.ols.lm,
random.effects, random.effects.lme4 ,
fixed.effects, fixed.effects.lm,
type="text",
column.labels=c("OLS (pooled)", "lm no pool.",
"Random Effects", "lme4 partial pool.",
"Fixed Effects", "lm compl. pool."),
dep.var.labels = c("log(wage)"),
keep.stat=c("n"),
keep=c("edu","bla","his","exp","marr","union"),
align = TRUE, digits = 4)
#>
#> =====================================================================================================
#> Dependent variable:
#> ----------------------------------------------------------------------------------------
#> log(wage)
#> panel OLS panel linear panel OLS
#> linear linear mixed-effects linear
#> OLS (pooled) lm compl. pool. Random Effects lme4 partial pool. Fixed Effects lm no pool.
#> (1) (2) (3) (4) (5) (6)
#> -----------------------------------------------------------------------------------------------------
#> educ 0.0913*** 0.0913*** 0.0919*** 0.0919***
#> (0.0052) (0.0052) (0.0107) (0.0108)
#>
#> black -0.1392*** -0.1392*** -0.1394*** -0.1394***
#> (0.0236) (0.0236) (0.0477) (0.0485)
#>
#> hisp 0.0160 0.0160 0.0217 0.0218
#> (0.0208) (0.0208) (0.0426) (0.0433)
#>
#> exper 0.0672*** 0.0672*** 0.1058*** 0.1060***
#> (0.0137) (0.0137) (0.0154) (0.0155)
#>
#> I(exper2) -0.0024*** -0.0024*** -0.0047*** -0.0047*** -0.0052*** -0.0052***
#> (0.0008) (0.0008) (0.0007) (0.0007) (0.0007) (0.0007)
#>
#> married 0.1083*** 0.1083*** 0.0640*** 0.0635*** 0.0467** 0.0467**
#> (0.0157) (0.0157) (0.0168) (0.0168) (0.0183) (0.0183)
#>
#> union 0.1825*** 0.1825*** 0.1061*** 0.1053*** 0.0800*** 0.0800***
#> (0.0172) (0.0172) (0.0179) (0.0179) (0.0193) (0.0193)
#>
#> -----------------------------------------------------------------------------------------------------
#> Observations 4,360 4,360 4,360 4,360 4,360 4,360
#> =====================================================================================================
#> Note: *p<0.1; **p<0.05; ***p<0.01
Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. (A very, very good book!)
Created on 2018-03-08 by the reprex package (v0.2.0).