Standard errors discrepancies between SAS and R for GLM gamma distribution

I am comparing GLM output from R and SAS for a model with a Gamma distribution. The point estimates are identical, but the two packages report different standard errors and therefore different p-values.

Does anyone know why? I am wondering whether R and SAS use different methods to estimate the standard errors, perhaps MLE vs. method of moments?

R sample code

library(data.table)

set.seed(2)
test = data.table(y = rnorm(100, 1000, 100), x1 = rnorm(100, 50, 20), x2 = rgamma(100, 0.01))
model = summary(glm(formula = y ~ x1 + x2, family = Gamma(link = "log"), data = test))

Using the same data generated here, I ran the following model in SAS:

proc genmod data=test_data;
    model y = x1 x2 / link=log dist=gamma;
run;

The output from R is as follows:

Call:
glm(formula = y ~ x1 + x2, family = Gamma(link = "log"), data = test)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.26213  -0.08456  -0.01033   0.08364   0.20878  

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  6.9210757  0.0324674 213.170   <2e-16 ***
x1          -0.0003371  0.0005985  -0.563    0.575    
x2           0.0234097  0.0627251   0.373    0.710    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.01376099)

    Null deviance: 1.3498  on 99  degrees of freedom
Residual deviance: 1.3436  on 97  degrees of freedom
AIC: 1240.6

Number of Fisher Scoring iterations: 4

The output from SAS: [screenshot of the PROC GENMOD results, not reproduced here]

asked Jun 15 '17 by YDao

1 Answer

By default, R's summary.glm estimates the scale (= 1/dispersion) parameter in the same way as the SAS PROC GENMOD MODEL-statement option SCALE=PEARSON. The choice of scale parameter affects the standard errors. See the SAS documentation here: https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect022.htm
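For reference, the default dispersion used by summary.glm is the Pearson estimate, which you can reproduce by hand. A minimal sketch, assuming the fitted model from the question is stored in an object called fit:

fit <- glm(y ~ x1 + x2, family = Gamma(link = "log"), data = test)

# Pearson estimate of the dispersion: sum of squared Pearson residuals
# divided by the residual degrees of freedom
pearson_disp <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Should reproduce the "Dispersion parameter for Gamma family taken to be ..."
# value printed by summary(fit)
pearson_disp
summary(fit)$dispersion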

By default SAS GENMOD gives the MLE of the gamma shape parameter. Suppose the fitted gamma model is stored in an object called "fit". To get the MLE in R, load the MASS package and then type

gamma.shape(fit)

This gives the MLE of the shape parameter alpha. If you then type

summary(fit, dispersion = 1/gamma.shape(fit)$alpha)

the summary function will use the MLE of alpha when computing the SEs, and they will match SAS GENMOD exactly.
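Putting the pieces together, a self-contained sketch of the whole comparison in R (using the simulated data from the question) would be:

library(data.table)
library(MASS)

set.seed(2)
test = data.table(y = rnorm(100, 1000, 100), x1 = rnorm(100, 50, 20), x2 = rgamma(100, 0.01))

fit = glm(y ~ x1 + x2, family = Gamma(link = "log"), data = test)

# Default summary: dispersion estimated from the Pearson residuals
summary(fit)

# ML estimate of the gamma shape parameter alpha
shape = gamma.shape(fit)

# Summary with dispersion = 1/alpha; the SEs should now agree with the
# default PROC GENMOD output
summary(fit, dispersion = 1/shape$alpha)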

I will make a separate post on this: while summary.glm gives the correct SEs (using the specified value of dispersion), it does not print the correct value of the AIC. The AIC does not use the specified dispersion; instead it uses the value computed from the Pearson residuals. The difference is small, but I would call it a bug.
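One way to see this (a sketch, continuing with fit and shape from above) is to compare the two summary objects directly; the dispersion argument changes the coefficient table but not the stored AIC:

# The standard errors differ between the two summaries ...
summary(fit)$coefficients
summary(fit, dispersion = 1/shape$alpha)$coefficients

# ... but the aic component is the same in both, i.e. it ignores the
# dispersion argument
summary(fit)$aic
summary(fit, dispersion = 1/shape$alpha)$aic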

answered Nov 05 '22 by Edward Malthouse