
first-difference linear panel model variance in R and Stata

I would like a colleague to replicate, with the plm package in R (or some other package), a first-difference linear panel data model that I am estimating in Stata.

In Stata, xtreg does not have a first-difference option, so instead I run:

reg D.(y x), nocons cluster(ID)

In R, I am doing:

plm(formula = y ~ -1 + x, data = data, model = "fd", index = c("ID","Period"))

The coefficients match, but the standard errors in R are larger than in Stata. I looked through the plm help pages and the PDF documentation, but I must be missing something.

asked Sep 26 '13 01:09 by dimitriy



1 Answer

The standard errors differ because you use the cluster() option in Stata but not in R. Without clustering, the two programs give identical standard errors:

R:

library(plm)                        # load plm first; it ships the Grunfeld data
data("Grunfeld", package = "plm")
grun.re <- plm(inv ~ -1 + value + capital, data = Grunfeld, model = "fd")
summary(grun.re)
Oneway (individual) effect First-Difference Model

Call:
plm(formula = inv ~ -1 + value + capital, data = Grunfeld, model = "fd")

Balanced Panel: n=10, T=20, N=200

Residuals :
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
-202.00  -15.20   -1.76   -1.39    7.95  199.00 

Coefficients :
         Estimate Std. Error t-value  Pr(>|t|)    
value   0.0890628  0.0082341  10.816 < 2.2e-16 ***
capital 0.2786940  0.0471564   5.910  1.58e-08 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Stata:

. reg D.(inv value capital), nocons

      Source |       SS       df       MS              Number of obs =     190
-------------+------------------------------           F(  2,   188) =   70.58
       Model |   259740.92     2   129870.46           Prob > F      =  0.0000
    Residual |  345936.615   188  1840.08838           R-squared     =  0.4288
-------------+------------------------------           Adj R-squared =  0.4228
       Total |  605677.536   190   3187.7765           Root MSE      =  42.896

------------------------------------------------------------------------------
       D.inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       value |
         D1. |   .0890628   .0082341    10.82   0.000     .0728197    .1053059
             |
     capital |
         D1. |    .278694   .0471564     5.91   0.000     .1856703    .3717177

If you want to cluster by group in R as well, pass a cluster-robust covariance matrix to coeftest:

R:

library(lmtest)  # for the coeftest function
coeftest(grun.re, vcov = vcovHC(grun.re, type = "HC0", cluster = "group"))

t test of coefficients:

        Estimate Std. Error t value  Pr(>|t|)    
value   0.089063   0.013728  6.4878 7.512e-10 ***
capital 0.278694   0.130954  2.1282   0.03462 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Stata:

. reg D.(inv value capital), nocons cluster(firm)

Linear regression                                      Number of obs =     190
                                                       F(  2,     9) =   47.80
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4288
                                                       Root MSE      =  42.896

                                  (Std. Err. adjusted for 10 clusters in firm)
------------------------------------------------------------------------------
             |               Robust
       D.inv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       value |
         D1. |   .0890628   .0145088     6.14   0.000     .0562416    .1218841
             |
     capital |
         D1. |    .278694    .138404     2.01   0.075    -.0343976    .5917857
------------------------------------------------------------------------------

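You can see that there is still a slight difference. With type="HC0", plm applies no finite-sample correction, whereas Stata's cluster() option scales the variance by G/(G-1) × (N-1)/(N-K) (G clusters, N observations, K coefficients) and computes p-values from a t distribution with G-1 degrees of freedom. For details in R, see page 39 of the plm manual.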
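As a rough check on the remaining gap between the two sets of clustered standard errors, the sketch below rescales the HC0 values printed by coeftest above using Stata's usual finite-sample factor for clustered variances. It uses only base R and the numbers already shown in the outputs; the variable names (se_hc0, adj, se_stata, p_stata) are made up for illustration, and the inputs are rounded, so agreement is only to about six decimal places.

```r
# Clustered (HC0) standard errors and estimates copied from the R output above
se_hc0 <- c(value = 0.013728, capital = 0.130954)
est    <- c(value = 0.0890628, capital = 0.278694)

G <- 10   # number of clusters (firms)
N <- 190  # observations left after first differencing
K <- 2    # estimated coefficients (no constant)

# Finite-sample factor Stata applies to cluster-robust variances
adj <- (G / (G - 1)) * ((N - 1) / (N - K))

# Rescaled standard errors, to compare with Stata's "Robust Std. Err." column
se_stata <- se_hc0 * sqrt(adj)
print(round(se_stata, 6))

# Stata bases clustered p-values on a t distribution with G - 1 degrees of freedom
p_stata <- 2 * pt(-abs(est / se_stata), df = G - 1)
print(round(p_stata, 3))
```

If the rescaled values line up with the Stata table, the two programs are computing the same sandwich estimator and differ only in the small-sample adjustment and the reference distribution. Your colleague could apply the same factor to the matrix returned by vcovHC before passing it to coeftest.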

answered Sep 23 '22 12:09 by Metrics