my problem is this: I get NA
where I should get some values in the computation of robust standard errors.
I am trying to do a fixed effect panel regression with cluster-robust standard errors. For this, I follow Arai (2011) who on p. 3 follows Stock/ Watson (2006) (later published in Econometrica, for those who have access). I would like to correct the degrees of freedom by (M/(M-1)*(N-1)/(N-K)
against downward bias as my number of clusters is finite and I have unbalanced data.
Similar problems have been posted before [1, 2] on StackOverflow and related problems [3] on CrossValidated.
Arai (and the answer in the 1st link) uses the following code for functions (I provide my data below with some further comment):
gcenter <- function(df1,group) {
variables <- paste(
rep("C", ncol(df1)), colnames(df1), sep=".")
copydf <- df1
for (i in 1:ncol(df1)) {
copydf[,i] <- df1[,i] - ave(df1[,i], group,FUN=mean)}
colnames(copydf) <- variables
return(cbind(df1,copydf))}
# 1-way adjusting for clusters
clx <- function(fm, dfcw, cluster){
# R-codes (www.r-project.org) for computing
# clustered-standard errors. Mahmood Arai, Jan 26, 2008.
# The arguments of the function are:
# fitted model, cluster1 and cluster2
# You need to install libraries `sandwich' and `lmtest'
# reweighting the var-cov matrix for the within model
library(sandwich);library(lmtest)
M <- length(unique(cluster))
N <- length(cluster)
K <- fm$rank
dfc <- (M/(M-1))*((N-1)/(N-K))
uj <- apply(estfun(fm),2, function(x) tapply(x, cluster, sum));
vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N)*dfcw
coeftest(fm, vcovCL) }
,where the gcenter
computes deviations from the mean (fixed effect). I then continue and do the regression with DS_CODE
being my cluster variable (I have named my data 'data').
centerdata <- gcenter(data, data$DS_CODE)
datalm <- lm(C.L1.retE1M ~ C.MCAP_SEC + C.Impact_change + C.Mom + C.BM + C.PD + C.CashGen + C.NITA + C.PE + C.PEdummy + factor(DS_CODE), data=centerdata)
M <- length(unique(data$DS_CODE))
dfcw <- datalm$df / (datalm$df - (M-1))
and want to calculate
clx(datalm, dfcw, data$DS_CODE)
However, when I want to compute uj (see formula clx
above) for the variance, I get only at the beginning some values for my regressors, then lots of zeros. If this input uj is used for the variance, only NAs
result.
My data
Since my data may be of special structure and I can't figure out the problem, I post the entire thing as a link from Hotmail. The reason is that with other data (taken from Arai (2011)) my problem does not occur. Sorry in advance for the mess but I'd be very grateful if you could have a look at it nevertheless. The file is a 5mb .txt file containing purely data.
After some time playing around, it works for me and gives me:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.5099e-16 5.2381e-16 0.8610 0.389254
C.MCAP_SEC -5.9769e-07 1.2677e-07 -4.7149 2.425e-06 ***
C.Impact_change -5.3908e-04 7.5601e-05 -7.1306 1.014e-12 ***
C.Mom 3.7560e-04 3.3378e-03 0.1125 0.910406
C.BM -1.6438e-04 1.7368e-05 -9.4645 < 2.2e-16 ***
C.PD 6.2153e-02 3.8766e-02 1.6033 0.108885
C.CashGen -2.7876e-04 1.4031e-02 -0.0199 0.984149
C.NITA -8.1792e-02 3.2153e-02 -2.5438 0.010969 *
C.PE -6.6170e-06 4.0138e-06 -1.6485 0.099248 .
C.PEdummy 1.3143e-02 4.8864e-03 2.6897 0.007154 **
factor(DS_CODE)130324 -5.2497e-16 5.2683e-16 -0.9965 0.319028
factor(DS_CODE)130409 -4.0276e-16 5.2384e-16 -0.7689 0.441986
factor(DS_CODE)130775 -4.4113e-16 5.2424e-16 -0.8415 0.400089
...
This leaves us with the question why it doesn't for you. I guess it has something to do with the format of your data. Is everything numeric? I converted the column classes and it looks like that for me:
str(dat)
'data.frame': 48251 obs. of 12 variables:
$ DS_CODE : chr "902172" "902172" "902172" "902172" ...
$ DNEW : num 2e+05 2e+05 2e+05 2e+05 2e+05 ...
$ MCAP_SEC : num 78122 71421 81907 80010 82462 ...
$ NITA : num 0.135 0.135 0.135 0.135 0.135 ...
$ CashGen : num 0.198 0.198 0.198 0.198 0.198 ...
$ BM : num 0.1074 0.1108 0.097 0.0968 0.0899 ...
$ PE : num 57 55.3 63.1 63.2 68 ...
$ PEdummy : num 0 0 0 0 0 0 0 0 0 0 ...
$ L1.retE1M : num -0.72492 0.13177 0.00122 0.07214 -0.07332 ...
$ Mom : num 0 0 0 0 0 ...
$ PD : num 5.41e-54 1.51e-66 3.16e-80 2.87e-79 4.39e-89 ...
$ Impact_change: num 0 -10.59 -10.43 0.7 -6.97 ...
What does str(data)
return for you?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With