Clustered standard errors different in plm vs lfe

Tags: r, plm, lfe

When I run a panel specification with clustered standard errors in plm and lfe, I get results that differ at the second significant figure. Does anyone know why the two packages calculate the SEs differently?

set.seed(572015)
library(lfe)
library(plm)
library(lmtest)
# clustering example
x <- c(sapply(sample(1:20), rep, times = 1000)) + rnorm(20*1000, sd = 1)
y <- 5 + 10*x + rnorm(20*1000, sd = 10) + c(sapply(rnorm(20, sd = 10), rep, times = 1000))
facX <- factor(sapply(1:20, rep, times = 1000))
mydata <- data.frame(y=y,x=x,facX=facX, state=rep(1:1000, 20))
model <- plm(y ~ x, data = mydata, index = c("facX", "state"), effect = "individual", model = "within")
plmTest <- coeftest(model, vcov = vcovHC(model, type = "HC1", cluster = "group"))
lfeTest <- summary(felm(y ~ x | facX | 0 | facX, data = mydata))
# the second entry of each coefficient table is the standard error of x
data.frame(lfeClusterSE = lfeTest$coefficients[2],
           plmClusterSE = plmTest[2])

lfeClusterSE plmClusterSE
1   0.06746538   0.06572588


asked May 08 '15 05:05

kennyB

1 Answer

The difference is in the degrees-of-freedom adjustment. This is the usual first guess when supposedly similar standard errors differ (see, e.g., "Different Robust Standard Errors of Logit Regression in Stata and R"). Here, the problem can be illustrated by comparing the results from (1) plm + vcovHC, (2) felm, and (3) lm + cluster.vcov (from package multiwayvcov).

First, I refit all models:

library(multiwayvcov)  # provides cluster.vcov(), used below
m1 <- plm(y ~ x, data = mydata, index = c("facX", "state"),
  effect = "individual", model = "within")
m2 <- felm(y ~ x | facX | 0 | facX, data = mydata)
m3 <- lm(y ~ facX + x, data = mydata)

All lead to the same coefficient estimates. For m3 the fixed effects are explicitly reported while they are not for m1 and m2. Hence, for m3 only the last coefficient is extracted with tail(..., 1).

all.equal(coef(m1), coef(m2))
## [1] TRUE
all.equal(coef(m1), tail(coef(m3), 1))
## [1] TRUE

The non-robust standard errors also agree.

# standard error of the last coefficient (here: x) from a covariance matrix
se <- function(object) tail(sqrt(diag(object)), 1)
se(vcov(m1))
##          x 
## 0.07002696 
se(vcov(m2))
##          x 
## 0.07002696 
se(vcov(m3))
##          x 
## 0.07002696

Comparing the clustered standard errors then shows that felm applies a degrees-of-freedom correction while plm's vcovHC (with its default settings) does not:

se(vcovHC(m1))
##          x 
## 0.06572423 
m2$cse
##          x 
## 0.06746538 
se(cluster.vcov(m3, mydata$facX))
##          x 
## 0.06746538 
se(cluster.vcov(m3, mydata$facX, df_correction = FALSE))
##          x 
## 0.06572423
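
As a rough check on the size of that correction, the sketch below rescales the uncorrected standard error by hand. It assumes the correction is the common small-sample factor G/(G - 1) * (N - 1)/(N - K), with G = 20 clusters, N = 20000 observations, and K = 21 estimated coefficients (20 fixed effects plus the slope on x). This formula is an assumption here, not something read off the felm source. It also assumes that type = "HC1" in plm's vcovHC scales the variance by n/(n - k) with k = 1, which would account for the slightly different plm value reported in the question. The variable names are made up for illustration; the numbers are copied from the output above.

```r
# Standard errors copied from the output above (rounded to 8 digits)
se_uncorrected <- 0.06572423  # se(vcovHC(m1)) and cluster.vcov(..., df_correction = FALSE)
se_felm        <- 0.06746538  # m2$cse and cluster.vcov(m3, mydata$facX)
se_plm_hc1     <- 0.06572588  # plm with type = "HC1", from the question

G <- 20         # number of clusters (levels of facX)
N <- 20 * 1000  # number of observations
K <- 21         # 20 fixed effects + slope on x (assumed parameter count)

# Assumed cluster correction: G/(G - 1) * (N - 1)/(N - K), applied to the variance
cluster_adj <- (G / (G - 1)) * ((N - 1) / (N - K))
se_clustered_expected <- se_uncorrected * sqrt(cluster_adj)

# Assumed HC1 scaling in plm: n/(n - k), with k = 1 coefficient in the within model
hc1_adj <- N / (N - 1)
se_hc1_expected <- se_uncorrected * sqrt(hc1_adj)

# Both comparisons allow for rounding in the printed values
isTRUE(all.equal(se_clustered_expected, se_felm, tolerance = 1e-6))
isTRUE(all.equal(se_hc1_expected, se_plm_hc1, tolerance = 1e-6))
```

If both comparisons hold, nearly all of the gap between the two packages comes from the G/(G - 1) term, which is noticeable with only 20 clusters. The HC1 scaling in plm is negligible at this sample size.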


answered Sep 24 '22 02:09

Achim Zeileis
