I have a problem with the glm function in R.
Specifically, I am not sure how to include nominal variables.
The results that I get in R after running the glm function are the following:
> df
  x1 x2 y
1  a  2 0
2  b  4 1
3  a  4 0
4  b  2 1
5  a  4 1
6  b  2 0
> str(df)
'data.frame': 6 obs. of 3 variables:
$ x1: Factor w/ 2 levels "a","b": 1 2 1 2 1 2
$ x2: num 2 4 4 2 4 2
$ y: Factor w/ 2 levels "0","1": 1 2 1 2 2 1
Call:
glm(formula = y ~ x1 + x2, family = "binomial", data = df)
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   -39.132  15208.471  -0.003    0.998
x1b            19.566   7604.236   0.003    0.998
x2              9.783   3802.118   0.003    0.998
However, when I run the LogitModelFit function in Wolfram Mathematica, I get different parameters.
The Wolfram code is provided below:
data = {{a, 2, 0}, {b, 4, 1}, {a, 4, 0}, {b, 2, 1}, {a, 4, 1}, {b, 2, 0}};
model = LogitModelFit[data, {x, y}, {x, y}, NominalVariables -> x]
model["BestFitParameters"]
And these are my estimated parameters:
{-18.5661, -18.5661, 9.28303}
model // Normal
1/(1 + E^(18.5661 - 9.28303 y + 18.5661 DiscreteIndicator[x, a, {a, b}]))
So, what is different here? Why do the results differ so much?
Am I doing something wrong in R or in Wolfram?
In R, logistic regression is performed with the glm() function, short for generalized linear model. This function can fit several kinds of regression models; the family argument is what requests a logistic regression.
The logistic regression model is one example of the broad class of generalized linear models (GLMs).
A generalized linear model is a generalization of ordinary linear regression that allows the response variable to have an error distribution other than the normal (Gaussian) distribution.
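For reference, here is a minimal sketch (using the variables from your data) of how the family argument selects the model:

# binomial() with its default logit link requests logistic regression;
# other families (gaussian(), poisson(), ...) give other GLMs
fit <- glm(y ~ x1 + x2, family = binomial(link = "logit"), data = df)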
You have effectively 4 groups, for which you are trying to estimate 3 parameters:
library(dplyr)
# cell size and observed proportion of y = 1 in each (x1, x2) cell;
# y is a factor here, so compare against the level instead of taking mean(y)
df %>% group_by(x1, x2) %>% summarise(n = n(), p = mean(y == "1"))
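With the six rows shown above, that summary works out to (computed by hand from the data, not a pasted console run):

# x1  x2  n    p
# a    2  1  0.0   <- y is always 0 here: perfectly predicted
# a    4  2  0.5
# b    2  2  0.5
# b    4  1  1.0   <- y is always 1 here: perfectly predicted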
As you can see from the huge standard errors, the parameter estimates aren't stable. This is a symptom of (quasi-)complete separation: the cells (a, 2) and (b, 4) are predicted perfectly, so the likelihood keeps improving as the coefficients drift towards ±Inf. The standard errors from Wolfram should also be very large (if they are reported).
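If you want R to confirm the separation directly, the detectseparation package (my suggestion, not something from your code) implements a check for infinite estimates:

library(detectseparation)
# reports whether the maximum likelihood estimates are infinite and
# in which direction each coefficient diverges
glm(y ~ x1 + x2, family = binomial(), data = df,
    method = "detect_separation")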
Second, Wolfram seems to use a different reference group for x1:
> df$x1 <- relevel(df$x1, "b")
> m <- glm(y ~ x1 + x2, family = binomial(), data = df, control = list(maxit = 100))
> summary(m)
Call:
glm(formula = y ~ x1 + x2, family = binomial(), data = df, control = list(maxit = 100))
Deviance Residuals:
       1         2         3         4         5         6
-0.00008   0.00008  -1.17741   1.17741   1.17741  -1.17741
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)   -19.566   7604.236  -0.003    0.998
x1a           -19.566   7604.236  -0.003    0.998
x2              9.783   3802.118   0.003    0.998
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 8.3178 on 5 degrees of freedom
Residual deviance: 5.5452 on 3 degrees of freedom
AIC: 11.545
Number of Fisher Scoring iterations: 18
This is much closer to the result from Wolfram (it is actually the same model you fitted; I just chose another reference group).
The predictions from both models (glm and Wolfram) will be practically equal. In fact, any fit in which the first two parameters are very large and negative (in the limiting best model they go to -Inf) and the third parameter equals half their magnitude (9.783 * 2 = 19.566) gives almost the same result.
The factor of two originates from the fact that x2 only takes the values 2 and 4, which differ by two.
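You can check this by plugging the four (x1, x2) cells into the linear predictor; the numbers below follow directly from the coefficients above:

# eta = -19.566 - 19.566*(x1 == "a") + 9.783*x2
# x1 = b, x2 = 2: eta = -19.566 + 19.566 =    0      -> p = 0.5
# x1 = b, x2 = 4: eta = -19.566 + 39.132 =  19.566   -> p ~ 1
# x1 = a, x2 = 2: eta = -39.132 + 19.566 = -19.566   -> p ~ 0
# x1 = a, x2 = 4: eta = -39.132 + 39.132 =    0      -> p = 0.5
plogis(c(0, 19.566, -19.566))  # inverse logit: 0.5, ~1, ~0
round(fitted(m), 3)            # same probabilities, per observation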