I would like a colleague to replicate, with the plm package in R (or some other package), a first-difference linear panel data model that I am estimating in Stata.
In Stata, xtreg does not have a first-difference option, so instead I run:
reg D.(y x), nocons cluster(ID)
In R, I am doing:
plm(formula = y ~ -1 + x, data = data, model = "fd", index = c("ID","Period"))
The coefficients match, but the standard errors in R are larger than in Stata. I looked in the plm help and PDF documentation, but I must be missing something.
In statistics and econometrics, the first-difference (FD) estimator is used to address the problem of omitted variables with panel data. It is obtained by running pooled OLS on a regression of the differenced variables. It is consistent under the assumptions of the fixed effects model, and in certain situations it can be more efficient than the standard fixed effects (or "within") estimator.
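As a rough illustration of "pooled OLS on the differenced variables", the sketch below differences the Grunfeld data (shipped with plm) by hand and compares the lm() coefficients with those from plm(model = "fd"). The helper d() and the d-prefixed column names are made up for this example, and the data are assumed to be sortable by firm and year with no gaps.

```r
library(plm)
data("Grunfeld", package = "plm")

# Sort by firm and year so that differences are taken within each firm
g <- Grunfeld[order(Grunfeld$firm, Grunfeld$year), ]

# Hypothetical helper: within-firm first difference (first year of each firm becomes NA)
d <- function(x) ave(x, g$firm, FUN = function(v) c(NA, diff(v)))

gd <- data.frame(
  firm     = g$firm,
  dinv     = d(g$inv),
  dvalue   = d(g$value),
  dcapital = d(g$capital)
)
gd <- na.omit(gd)  # drops the first observation of each firm

# Pooled OLS on the differenced data, no intercept
fd_lm <- lm(dinv ~ -1 + dvalue + dcapital, data = gd)

# The same model estimated by plm
fd_plm <- plm(inv ~ -1 + value + capital, data = Grunfeld,
              model = "fd", index = c("firm", "year"))

# Side-by-side comparison of the point estimates
cbind(lm = unname(coef(fd_lm)), plm = unname(coef(fd_plm)))
```

The two columns should agree, which is the sense in which Stata's reg D.(y x), nocons and plm's "fd" model estimate the same thing.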
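The standard errors are different because you use the cluster option in Stata, while plm reports conventional (non-clustered) standard errors by default. Without clustering, the two programs agree: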
R:
library(plm)
data("Grunfeld", package = "plm") # the Grunfeld data set ships with plm
grun.re <- plm(inv~-1+value+capital,data=Grunfeld,model="fd")
> summary(grun.re)
Oneway (individual) effect First-Difference Model
Call:
plm(formula = inv ~ -1 + value + capital, data = Grunfeld, model = "fd")
Balanced Panel: n=10, T=20, N=200
Residuals :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-202.00  -15.20   -1.76   -1.39    7.95  199.00

Coefficients :
         Estimate Std. Error t-value  Pr(>|t|)
value   0.0890628  0.0082341  10.816 < 2.2e-16 ***
capital 0.2786940  0.0471564   5.910  1.58e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Stata:
. reg D.(inv value capital), nocons
      Source |       SS       df       MS              Number of obs =     190
-------------+------------------------------           F(  2,   188) =   70.58
       Model |   259740.92     2   129870.46           Prob > F      =  0.0000
    Residual |  345936.615   188  1840.08838           R-squared     =  0.4288
-------------+------------------------------           Adj R-squared =  0.4228
       Total |  605677.536   190   3187.7765           Root MSE      =  42.896
------------------------------------------------------------------------------
       D.inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       value |
         D1. |   .0890628   .0082341    10.82   0.000     .0728197    .1053059
             |
     capital |
         D1. |    .278694   .0471564     5.91   0.000     .1856703    .3717177
If you want to cluster by group, here is the solution:
R:
library(lmtest) # for coeftest function
coeftest(grun.re,vcov=vcovHC(grun.re,type="HC0",cluster="group"))
t test of coefficients:

        Estimate Std. Error t value  Pr(>|t|)
value   0.089063   0.013728  6.4878 7.512e-10 ***
capital 0.278694   0.130954  2.1282   0.03462 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Stata:
. reg D.(inv value capital), nocons cluster(firm)
Linear regression                                      Number of obs =     190
                                                       F(  2,     9) =   47.80
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4288
                                                       Root MSE      =  42.896

                                   (Std. Err. adjusted for 10 clusters in firm)
------------------------------------------------------------------------------
             |               Robust
       D.inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       value |
         D1. |   .0890628   .0145088     6.14   0.000     .0562416    .1218841
             |
     capital |
         D1. |    .278694    .138404     2.01   0.075    -.0343976    .5917857
------------------------------------------------------------------------------
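You can see that there is a slight difference. It is consistent with Stata's finite-sample correction for clustered standard errors: scaling the HC0 variance from R by (G/(G-1)) × ((N-1)/(N-K)), with G = 10 clusters, N = 190 observations and K = 2 regressors, multiplies the R standard errors by about 1.057 and gives 0.0145 and 0.1384, as in the Stata output. Stata also computes p-values from a t distribution with G-1 = 9 degrees of freedom, whereas coeftest uses the residual degrees of freedom by default, which is why the p-values for capital differ (0.075 versus 0.035). For details in R, see page 39 of the plm manual, and also here plus here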
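If the goal is to have R print the same clustered standard errors as Stata, one option is to rescale the HC0 matrix by hand. The sketch below assumes that Stata's adjustment is the factor (G/(G-1)) × ((N-1)/(N-K)) and that p-values should use G-1 degrees of freedom; the object names (fd, V_hc0, V_stata, adj) are chosen for this example only.

```r
library(plm)
library(lmtest)
data("Grunfeld", package = "plm")

fd <- plm(inv ~ -1 + value + capital, data = Grunfeld,
          model = "fd", index = c("firm", "year"))

G <- length(unique(Grunfeld$firm))  # number of clusters (firms)
K <- length(coef(fd))               # number of regressors
N <- df.residual(fd) + K            # observations left after differencing

# Cluster-robust (by firm) covariance matrix without small-sample correction
V_hc0 <- vcovHC(fd, type = "HC0", cluster = "group")

# Assumed Stata-style finite-sample adjustment
adj <- (G / (G - 1)) * ((N - 1) / (N - K))
V_stata <- adj * V_hc0

# Adjusted standard errors
se <- sqrt(diag(V_stata))
se

# t tests with G - 1 degrees of freedom, as in Stata's clustered output
coeftest(fd, vcov. = V_stata, df = G - 1)
```

With the Grunfeld data this should give standard errors close to the .0145088 and .138404 that Stata reports above. The manual scaling is used here because it makes the adjustment explicit, rather than relying on one of the built-in HC types matching Stata exactly.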