Is there a way to get R to run all possible models (with all combinations of variables in a dataset) to produce the best/most accurate linear model and then output that model?
I feel like there is a way to do this, but I am having a hard time finding the information.
There are numerous ways this could be achieved, but for a simple way of doing this I would suggest that you have a look at the glmulti
package, which is described in detail in this paper:
Alternatively, very simple example of the model selection as available on the Quick-R website:
# Stepwise Regression
library(MASS)
fit <- lm(y~x1+x2+x3,data=mydata)
step <- stepAIC(fit, direction="both")
step$anova # display results
Or to simplify even more, you can do more manual model comparison:
fit1 <- lm(y ~ x1 + x2 + x3 + x4, data=mydata)
fit2 <- lm(y ~ x1 + x2, data=mydata)
anova(fit1, fit2)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With