Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Multivariate logistic regression in r?

How does one perform a multivariate (multiple dependent variables) logistic regression in R?

I know you do this for linear regression, and this works

form <-cbind(A,B,C,D)~shopping_pt+price
mlm.model.1 <- lm(form, data = train)

But when I try the following (see below) for logistic regression, it does not work.

model.logistic <- glm(form, family=binomial(link=logit), data=train)

Thank you for your help.

To add, it appears that my code to do this with linear models above may not be correct. I am trying to what is outlined in this document, which some might find useful.

ftp://ftp.cis.upenn.edu/pub/datamining/public_html/ReadingGroup/papers/multiResponse.pdf

like image 224
blast00 Avatar asked May 09 '14 00:05

blast00


1 Answers

It seems to me that lm(cbind(A,B,C,D)~shopping_pt+price) just fits four different models for the four dependent variables. The second link you provide even mentions:

The individual coefficients, as well as their standard errors will be the same as those produced by the multivariate regression. However, the OLS regressions will not produce multivariate results, nor will they allow for testing of coefficients across equations.

Meaning that all estimates will be the same, you'll just have to predict four times; and hypotheses on the fitted coefficients are independent across models.

I just tried this example below, showing that it indeed seems like that:

> set.seed(0)
> x1 <- runif(10)
> x2 <- runif(10)
> y1 <- 2*x1 + 3*x2 + rnorm(10)
> y2 <- 4*x1 + 5*x2 + rnorm(10)
> mm <- lm(cbind(y1,y2)~x1+x2)
> m1 <- lm(y1~x1+x2)
> m2 <- lm(y2~x1+x2)
# If we look at mm, m1 and m2, we see that models are identical
# If we predict new data, they give the same estimates
> x1_ <- runif(10)
> x2_ <- runif(10)
> predict(mm, newdata=list(x1=x1_, x2=x2_))
          y1       y2
1  2.9714571 5.965774
2  2.7153855 5.327974
3  2.5101344 5.434516
4  1.3702441 3.853450
5  0.9447582 3.376867
6  2.3809256 5.051257
7  2.5782102 5.544434
8  3.1514895 6.156506
9  2.4421892 5.061288
10 1.6712042 4.470486
> predict(m1, newdata=list(x1=x1_, x2=x2_))
        1         2         3         4         5         6         7         8         9        10 
2.9714571 2.7153855 2.5101344 1.3702441 0.9447582 2.3809256 2.5782102 3.1514895 2.4421892 1.6712042 
> predict(m2, newdata=list(x1=x1_, x2=x2_))
       1        2        3        4        5        6        7        8        9       10 
5.965774 5.327974 5.434516 3.853450 3.376867 5.051257 5.544434 6.156506 5.061288 4.470486

So this suggests that you can just fit four logistic models separately.

like image 167
Julián Urbano Avatar answered Sep 19 '22 08:09

Julián Urbano