How does one perform a multivariate (multiple dependent variables) logistic regression in R?
I know how to do this for linear regression, and this works:
form <- cbind(A, B, C, D) ~ shopping_pt + price
mlm.model.1 <- lm(form, data = train)
But when I try the same formula for logistic regression, it does not work:
model.logistic <- glm(form, family=binomial(link=logit), data=train)
Thank you for your help.
To add, it appears that my code for doing this with linear models above may not be correct. I am trying to do what is outlined in this document, which some might find useful:
ftp://ftp.cis.upenn.edu/pub/datamining/public_html/ReadingGroup/papers/multiResponse.pdf
It seems to me that lm(cbind(A,B,C,D) ~ shopping_pt + price) just fits four different models, one for each of the four dependent variables. The second link you provide even mentions:
The individual coefficients, as well as their standard errors will be the same as those produced by the multivariate regression. However, the OLS regressions will not produce multivariate results, nor will they allow for testing of coefficients across equations.
This means the coefficient estimates will be the same as those from four separate fits; you'll just have to predict four times, and hypotheses on the fitted coefficients are tested independently for each model.
I tried the example below, which suggests that this is indeed the case:
> set.seed(0)
> x1 <- runif(10)
> x2 <- runif(10)
> y1 <- 2*x1 + 3*x2 + rnorm(10)
> y2 <- 4*x1 + 5*x2 + rnorm(10)
> mm <- lm(cbind(y1,y2)~x1+x2)
> m1 <- lm(y1~x1+x2)
> m2 <- lm(y2~x1+x2)
# If we look at mm, m1 and m2, we see that the fitted models are identical
# If we predict on new data, they give the same estimates
> x1_ <- runif(10)
> x2_ <- runif(10)
> predict(mm, newdata=list(x1=x1_, x2=x2_))
y1 y2
1 2.9714571 5.965774
2 2.7153855 5.327974
3 2.5101344 5.434516
4 1.3702441 3.853450
5 0.9447582 3.376867
6 2.3809256 5.051257
7 2.5782102 5.544434
8 3.1514895 6.156506
9 2.4421892 5.061288
10 1.6712042 4.470486
> predict(m1, newdata=list(x1=x1_, x2=x2_))
1 2 3 4 5 6 7 8 9 10
2.9714571 2.7153855 2.5101344 1.3702441 0.9447582 2.3809256 2.5782102 3.1514895 2.4421892 1.6712042
> predict(m2, newdata=list(x1=x1_, x2=x2_))
1 2 3 4 5 6 7 8 9 10
5.965774 5.327974 5.434516 3.853450 3.376867 5.051257 5.544434 6.156506 5.061288 4.470486
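The same comparison can be made on the coefficients rather than the predictions. A minimal sketch, refitting the models from the transcript above; same_y1 and same_y2 are just illustrative names:

```r
# Refit the simulated example from the transcript above.
set.seed(0)
x1 <- runif(10)
x2 <- runif(10)
y1 <- 2*x1 + 3*x2 + rnorm(10)
y2 <- 4*x1 + 5*x2 + rnorm(10)

mm <- lm(cbind(y1, y2) ~ x1 + x2)  # one call, two responses
m1 <- lm(y1 ~ x1 + x2)             # separate fit for y1
m2 <- lm(y2 ~ x1 + x2)             # separate fit for y2

# coef(mm) is a matrix with one column per response; each column
# should match the coefficients of the corresponding separate fit.
same_y1 <- isTRUE(all.equal(coef(mm)[, "y1"], coef(m1)))
same_y2 <- isTRUE(all.equal(coef(mm)[, "y2"], coef(m2)))
print(c(y1 = same_y1, y2 = same_y2))
```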
So this suggests that you can just fit four logistic models separately.
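A rough sketch of what that could look like, assuming A, B, C and D are each coded as 0/1. The train data frame here is simulated only so that the snippet runs on its own; the column names follow the question, everything else is made up:

```r
# Simulated stand-in for `train` (hypothetical data, not the real set).
set.seed(1)
n <- 200
train <- data.frame(shopping_pt = sample(1:10, n, replace = TRUE),
                    price = runif(n, 500, 800))
responses <- c("A", "B", "C", "D")
for (resp in responses) {
  # Assumption: each dependent variable is a binary 0/1 outcome.
  train[[resp]] <- rbinom(n, size = 1, prob = 0.5)
}

# Fit one logistic regression per response, all with the same predictors.
models <- lapply(responses, function(resp) {
  glm(reformulate(c("shopping_pt", "price"), response = resp),
      family = binomial(link = "logit"), data = train)
})
names(models) <- responses

# Predicted probabilities: one row per observation, one column per response.
probs <- sapply(models, predict, newdata = train, type = "response")
head(probs)
```

Each element of models is an ordinary glm object, so summary(models$A) and so on work as usual. As with the linear case, nothing here tests hypotheses across the four equations.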