As I understand it, when you fit a linear model in R using a nominal predictor, R essentially uses dummy 1/0 variables for each level (except the reference level), and then giving a regular old coefficient for each of these variables. What does it do for ordinal predictors? It gives me estimated coefficients for each level, so it's not just treating the ranks as a numerical predictor. How do I properly interpret output like this?
cars.data <- mtcars[,1:3]
cars.data$cyl.ord <- ordered(cars.data$cyl)
lm(mpg ~ disp + cyl.ord, data = cars.data)
Call: lm(formula = mpg ~ disp + cyl.ord) Coefficients: (Intercept) disp cyl.ord.L cyl.ord.Q 26.34212 -0.02731 -3.38852 1.95127
I originally assumed that I could use the coefficients from the ordinal variable the same way I would if it were an unordered categorical, but predict.lm(ord.model, newdata = data.frame("disp" = 150, "cyl.ord" = "6"))
gives 20.65263, while 26.34212378 -0.02730864*150 -3.38851642
gives only 18.85731, so that can't be it. (I tried +1.95127 and +0 in place of -3.38852 in case I was just mixed up about what order they were listed in, but no dice.) I couldn't find anything in the documentation for order()
or lm()
, online, or in any of my books.
How does R deal with ordinal predictors mathematically, and how should I deal with the output?
It appears that transforming a factor variable into an ordinal factor variable changes the default contrast from "contr.treatment" to "contr.poly"
# make cyl a factor
cars.data$cyl <- factor(cars.data$cyl)
Now, consider the outputs of lm
with "contr.treatment" and "contr.poly"
lm(mpg ~ disp + cyl.ord, data=cars.data)
Call:
lm(formula = mpg ~ disp + cyl.ord, data = cars.data)
Coefficients:
(Intercept) disp cyl.ord.L cyl.ord.Q
26.34212 -0.02731 -3.38852 1.95127
lm(mpg ~ disp + cyl, data=cars.data,
contrasts=list(cyl="contr.poly"))
Call:
lm(formula = mpg ~ disp + cyl, data = cars.data,
contrasts=list(cyl = "contr.poly"))
Coefficients:
(Intercept) disp cyl.L cyl.Q
26.34212 -0.02731 -3.38852 1.95127
Sor the ordered factor uses "contr.poly" as the default contrast and we can get the same results from an unordered factor. Now, consider the unordered factor.
lm(mpg ~ disp + cyl, data=cars.data)
Call:
lm(formula = mpg ~ disp + cyl, data = cars.data)
Coefficients:
(Intercept) disp cyl6 cyl8
29.53477 -0.02731 -4.78585 -4.79209
lm(mpg ~ disp + cyl.ord, data=cars.data, contrasts=list(cyl.ord="contr.treatment"))
Call:
lm(formula = mpg ~ disp + cyl.ord, data = cars.data,
contrasts=list(cyl.ord="contr.treatment"))
Coefficients:
(Intercept) disp cyl.ord6 cyl.ord8
29.53477 -0.02731 -4.78585 -4.79209
So, an unordered factor variable uses "contr.treatment" by default and we can get the same results from an ordered factor by explicitly asking for it.
But let's take a closer look at the model matrices that are used in the regressions.
# Show model matrix
model.matrix(mpg ~ disp + cyl, data=cars.data)
(Intercept) disp cyl6 cyl8
Mazda RX4 1 160.0 1 0
Mazda RX4 Wag 1 160.0 1 0
Datsun 710 1 108.0 0 0
...
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$cyl
[1] "contr.treatment"
Now, use "contr.poly" as contrast
model.matrix(mpg ~ disp + cyl, data=cars.data, contrasts.arg=list(cyl="contr.poly"))
(Intercept) disp cyl.L cyl.Q
Mazda RX4 1 160.0 -9.073800e-17 -0.8164966
Mazda RX4 Wag 1 160.0 -9.073800e-17 -0.8164966
Datsun 710 1 108.0 -7.071068e-01 0.4082483
...
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$cyl
[1] "contr.poly"
Next, check out cyl.ord in place of cyl
model.matrix(mpg ~ disp + cyl.ord, data=cars.data)
(Intercept) disp cyl.ord.L cyl.ord.Q
Mazda RX4 1 160.0 -9.073800e-17 -0.8164966
Mazda RX4 Wag 1 160.0 -9.073800e-17 -0.8164966
Datsun 710 1 108.0 -7.071068e-01 0.4082483
...
attr(,"assign")
[1] 0 1 2 2
attr(,"contrasts")
attr(,"contrasts")$cyl.ord
[1] "contr.poly"
The final two matrices have identical entries, so the use of "contr.poly" appears to explain the initial discrepencies. To learn more about contrasts, you can take a look at ?contrasts
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With