Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fitted values for multinom in R: Coefficients for Reference Category?

Tags:

r

multinomial

I'm using the function multinom from the nnet package to run a multinomial logistic regression.

In multinomial logistic regression, as I understand it, the coefficients are the changes in the log of the ratio of the probability of a response over the probability of the reference response (i.e., ln(P(i)/P(r))=B1+B2*X... where i is one response category, r is the reference category, and X is some predictor).

However, fitted(multinom(...)) produces estimates for each category, even the reference category r.

EDIT Example:

set.seed(1)
library(nnet)
DF <- data.frame(X = as.numeric(rnorm(30)), 
                 Y = factor(sample(letters[1:5],30, replace=TRUE)))
DF$Y<-relevel(DF$Y, ref="a") #ensure a is the reference category
model <- multinom(Y ~ X, data = DF)
coef(model)
#  (Intercept)           X
#b   0.1756835  0.55915795
#c  -0.2513414 -0.31274745
#d   0.1389806 -0.12257963
#e  -0.4034968  0.06814379

head(fitted(model))
#          a         b          c         d         e
#1 0.2125982 0.2110692 0.18316042 0.2542913 0.1388810
#2 0.2101165 0.1041655 0.26694618 0.2926508 0.1261210
#3 0.2129182 0.2066711 0.18576567 0.2559369 0.1387081
#4 0.1733332 0.4431170 0.08798363 0.1685015 0.1270647
#5 0.2126573 0.2102819 0.18362323 0.2545859 0.1388516
#6 0.1935449 0.3475526 0.11970164 0.2032974 0.1359035

head(DF)
#           X Y
#1 -0.3271010 a

To calculate the predicted probability ratio between response b and response a for row 1, we calculate exp(0.1756835+0.55915795*(-0.3271010))=0.9928084. And I see that this corresponds to the fitted P(b)/P(a) for row 1 (0.2110692/0.2125982=0.9928084).

Is the fitted probability for the reference category calculated algebraically (e.g., 0.2110692/exp(0.1756835+0.55915795*(-0.3271010)))?

Is there a way to obtain the equation for the predicted probability of the reference category?

like image 844
jflournoy Avatar asked Jun 24 '13 19:06

jflournoy


2 Answers

I had the same question, and after looking around I think the solution is: given 3 classes: a,b,c and the fitted(model) probabilities pa,pb,pc output by the algorithm, you can reconstruct those probabilities from these 3 equations:

log(pb/pa) = beta1*X

log(pc/pa) = beta2*X

pa+pb+pc=1

Where beta1,beta2 are the rows of the output of coef(model), and X is your input data.

Playing with those equations you get to:

pb = exp(beta1*X)/(1+exp(beta1*X)+exp(beta2*X))

pc = exp(beta2*X)/(1+exp(beta1*X)+exp(beta2*X))

pa = 1 - pb - pc
like image 171
Gabriela Avatar answered Nov 12 '22 19:11

Gabriela


The key here is that in the help file for multinom() it says that "A log-linear model is fitted, with coefficients zero for the first class."

So that means the predicted values for the reference class can be calculated directly assuming that the coefficients for class "a" are both zero. For example, for the sample row given above, we could calculate the predicted probability for class "a" using the softmax transform: exp(0+0)/(exp(0+0) + exp(0.1756835 + 0.55915795*(-0.3271010)) + exp(-0.2513414 + (-0.31274745)*(-0.3271010)) + exp(0.1389806 + (-0.12257963)*(-0.3271010)) + exp(-0.4034968 + 0.06814379*(-0.3271010)))

or perhaps more simply, using non-hard-coded numbers, we can calculate the entire set of probabilities for the first row of data as:

softMax <- function(x){
    expx <- exp(x)
    return(expx/sum(expx))
}
coefs <- rbind(c(0,0), coef(model))
linear.predictor <- as.vector(coefs%*%c(1,-0.3271010))
softMax(linear.predictor)

FWIW: the example in the original question does not reproduce for me exactly, my seed gives different random deviates. So I have reproduced the example freshly and with my calculations below.

library(nnet)
set.seed(1)
DF <- data.frame(
    X = as.numeric(rnorm(30)), 
    Y = factor(sample(letters[1:5],30, replace=TRUE)))
DF$Y<-relevel(DF$Y, ref="a") #ensure a is the reference category
model <- multinom(Y ~ X, data = DF)
coef(model)
##   (Intercept)             X
## b -0.33646439  1.200191e-05
## c -0.36390688 -1.773889e-01
## d -0.45197598  1.049034e+00
## e -0.01418543  3.076309e-01

DF[1,]
##            X Y
## 1 -0.6264538 c

fitted.values(model)[1,]
##          a          b          c          d          e 
## 0.27518921 0.19656378 0.21372240 0.09076844 0.22375617 

coefs <- rbind(c(0,0), coef(model))
linear.predictor <- as.vector(coefs%*%c(1,DF[1,"X"]))
softMax(linear.predictor)
## [1] 0.27518921 0.19656378 0.21372240 0.09076844 0.22375617
like image 30
Nicholas G Reich Avatar answered Nov 12 '22 19:11

Nicholas G Reich