I am comparing GLM output from R and SAS of a model with a Gamma distribution. The point estimations are identical, but they have different estimation of the standard error and therefore different p-values.
Does anyone know why? I am wondering if R and SAS uses different methods to estimate the standard errors? Maybe MLE vs. method of moments?
R sample code
set.seed(2)
test = data.table(y = rnorm(100, 1000, 100), x1 = rnorm(100, 50, 20), x2 = rgamma(100, 0.01))
model = summary(glm(formula = y ~ x1+x2 , family = Gamma(link = "log"), data = test))
Using the same data generated here I used the following code to run a model in SAS:
proc genmod data= test_data;
model y = x1 x2 /link= log dist= gamma;
run;
The output from R is as following:
Call:
glm(formula = y ~ x1 + x2, family = Gamma(link = "log"), data = test)
Deviance Residuals:
Min 1Q Median 3Q Max
-0.26213 -0.08456 -0.01033 0.08364 0.20878
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.9210757 0.0324674 213.170 <2e-16 ***
x1 -0.0003371 0.0005985 -0.563 0.575
x2 0.0234097 0.0627251 0.373 0.710
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Gamma family taken to be 0.01376099)
Null deviance: 1.3498 on 99 degrees of freedom
Residual deviance: 1.3436 on 97 degrees of freedom
AIC: 1240.6
Number of Fisher Scoring iterations: 4
The output from SAS:
By default R computes the scale=1/dispersion parameter in the same way as the sas/genmod/model option scale=pearson. The choice of scale parameter affects the SEs. See the documentation here: https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_genmod_sect022.htm
By default SAS/genmod gives MLEs of the shape parameter. Suppose the fitted gamma model is stored in the list "fit". To get this in R load the MASS library then type
gamma.shape(fit)
This gives the MLEs of shape parameter alpha. If you then type
summary(fit, dispersion=1/gamma.shape(fit)$alpha))
the summary function will use the MLE of alpha when computing the SEs, and they will match SAS/genmod exactly.
I will make a separate post on this. While summary.glm gives the correct SEs (using the specified value of dispersion), it does not print the correct value of AIC (It does not use the specified dispersion, instead it uses the value computed with the Pearson residuals). The difference is small, but I would call it a bug.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With