How can I use the ddply function to fit a linear model per group?
x1 <- c(1:10, 1:10)
x2 <- c(1:5, 1:5, 1:5, 1:5)
x3 <- c(rep(1,5), rep(2,5), rep(1,5), rep(2,5))
set.seed(123)
y <- rnorm(20, 10, 3)
mydf <- data.frame(x1, x2, x3, y)
require(plyr)
ddply(mydf, mydf$x3, .fun = lm(mydf$y ~ mydf$X1 + mydf$x2))
This generates this error:
Error in model.frame.default(formula = mydf$y ~ mydf$X1 + mydf$x2, drop.unused.levels = TRUE) : invalid type (NULL) for variable 'mydf$X1'
Appreciate your help.
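Three things go wrong in your call: the column is x1, not X1 (R is case-sensitive, which is what the error message is complaining about); .fun must be a function, not the result of an lm() call that has already been evaluated; and the grouping variable should be given as .(x3) rather than mydf$x3. Also, lm returns a model object rather than a data frame, so dlply (data frame in, list out) is a better fit than ddply. Here is what you need to do.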
mods = dlply(mydf, .(x3), lm, formula = y ~ x1 + x2)
mods is a list of two lm objects, one per level of x3, containing the regression results. You can extract what you need from mods. For example, if you want to extract the coefficients, you could write
coefs = ldply(mods, coef)
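This gives you the following. The x2 coefficient is NA because, within each x3 group, x2 is an exact linear function of x1, so lm cannot estimate it separately.
  x3 (Intercept)         x1 x2
1  1    11.71015 -0.3193146 NA
2  2    21.83969 -1.4677690 NA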
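If it helps to see what dlply and ldply are doing here, the same split, fit, and combine steps can be sketched in base R. This is only an illustration, not a replacement for the plyr call above; the names mods_base and coefs_base are made up for this example.

```r
# Rebuild the data from the question
x1 <- c(1:10, 1:10)
x2 <- c(1:5, 1:5, 1:5, 1:5)
x3 <- c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5))
set.seed(123)
y <- rnorm(20, 10, 3)
mydf <- data.frame(x1, x2, x3, y)

# Split the data frame by x3 and fit one model per piece
# (mods_base is a hypothetical name, chosen for this sketch)
mods_base <- lapply(split(mydf, mydf$x3),
                    function(d) lm(y ~ x1 + x2, data = d))

# Stack the coefficient vectors into a matrix, one row per group
coefs_base <- do.call(rbind, lapply(mods_base, coef))
coefs_base
```

The row names of coefs_base are the levels of x3, which plays the role of the x3 column that ldply adds.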
EDIT: If you want the ANOVA tables, you can just do
ldply(mods, anova)
  x3 Df    Sum Sq   Mean Sq   F value     Pr(>F)
1  1  1  2.039237  2.039237 0.4450663 0.52345980
2  1  8 36.654982  4.581873        NA         NA
3  2  1 43.086916 43.086916 4.4273907 0.06849533
4  2  8 77.855187  9.731898        NA         NA
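In the ANOVA output, ldply drops the row names, so within each x3 group the first row is the x1 term and the second is the residuals. The same pattern works for any other per-model summary. As a rough sketch, assuming you wanted R-squared for each group (the r2 name and the small helper function are just for illustration):

```r
library(plyr)

# Rebuild the data from the question
x1 <- c(1:10, 1:10)
x2 <- c(1:5, 1:5, 1:5, 1:5)
x3 <- c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5))
set.seed(123)
y <- rnorm(20, 10, 3)
mydf <- data.frame(x1, x2, x3, y)

# One lm fit per level of x3, as in the answer above
mods <- dlply(mydf, .(x3), lm, formula = y ~ x1 + x2)

# Return a one-row data frame per model; ldply binds them
# together and adds the x3 label column
r2 <- ldply(mods, function(m) {
  data.frame(r.squared = summary(m)$r.squared)
})
r2
```

Returning a small data frame from the helper keeps the column name under your control, which is handy if you later want several statistics side by side.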