Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Testing the Proportional Odds Assumption in R

I am working in R with a response variable that is the letter grade the student received in a specific course. The response is ordinal, and, in my opinion, seems logically proportional. My understanding is that I need to test that it is proportional before I can use polr() instead of multinom().

For one of my courses of data, I "tested" the proportionality like this:

M1 <- logLik(polrModel)  #'log Lik.' -1748.180691 (df=8)
M2 <- logLik(multinomModel)  #'log Lik.' -1734.775727 (df=20)
G <- -2*(M1$1 - M2$2)) #I used a block bracket here in the real code
# 26.8099283
pchisq(G,12,lower.tail = FALSE) #DF is #of predictors
#0.008228890393     #THIS P-VAL TELLS ME TO REJECT PROPORTIONAL

For a second way of testing the proportional odds assumption, I also ran two vglm models, one with family=cumulative(parallel =TRUE) the other with family=cumulative(parallel =FALSE). I then ran a pchisq() test with the difference of the models' deviances and the differences of the residual degrees of freedom.

Is either of these way respectable? If not, I would love help with the actual coding for determining whether to accept or reject the proportional odds assumption!

In addition to the above two tests, I graphed my cumulative probabilities against each of the predictors, individually. I read that I want these lines to be parallel. What I don't understand is, with polr() your output is a single slope for each independent variable (a coefficient) and then a specific intercept depending on which cumulative probability you are working with (ex: P(Y<=A), P(Y<=B), etc). So, if your slope coefficients are all the same for each of the equations, how could the lines not be parallel?

I picked up the basics of my knowledge on Chris Bilder's YouTube class; he talks about the parallel graphs here at minute 42.

Any help is appreciated! Thank you!

like image 597
Nameless Avatar asked May 03 '16 23:05

Nameless


1 Answers

Your approach is essentially correct. I have the following code inspired from Fox's “An R and S-PLUS companion to Applied Regression”. Chapter 5: Fitting generalized linear models. Pages 155-189. Please cite the book chapter when using the code. This chapter has a section on graphing also.

library(car)
library(nnet)
library(xlsx)
library(MASS)
options(warn=1)
options(digits = 3)
#
Trial <- read.xlsx("Trial.xls", "Sheet 1")
# Set up an out file structure
sink("Testing_adequacy_of_Prop_odds.txt")
# Trial$Outcome is assessed on a six point scale 0-5
schtyp_M_M.f <- factor(Trial$Outcome, labels = c("M0", "M1", "M2", "M3", "M4", "M5"))
#
cat("Multinomial logistic regression \n")
# Assign takes on a value of 1 (Treatment) or 0 (Control) 
mod.multinom <-multinom(schtyp_M_M.f~Assign, data = Trial)
print(summary(mod.multinom, cor=F, Wald=T))
x1<-logLik(mod.multinom)
cat("Degrees of freedom Multinomial logistic regression \n")
print(df_of_multinom_model <- attributes(x1)$df)
cat("Proportional odds logistic regression\n")
mod.polr <- polr(schtyp_M_M.f ~ Assign, data=Trial)
print(summary(mod.polr))
x2<-logLik(mod.polr)
cat("Degrees of freedom Proportional Odds Logistic Regression \n")
print(df_of_polr_model <- attributes(x2)$df)

cat("Answering the question: Is proportional odds model assumption violated\n")
cat("P value for difference in AIC between POLR and Multinomial Logit model\n")
# abs since the values could be negative. That is negative difference of degrees of freedom would produce p=NaN
print(1-pchisq(abs(mod.polr$deviance-mod.multinom$deviance),   abs(df_of_multinom_model-df_of_polr_model)))
sink()
like image 63
abcihep Avatar answered Sep 24 '22 11:09

abcihep