Hi, I'm learning statsmodels and can't figure out the difference between ":" and "*" (interaction terms) in formulas for statsmodels OLS regression. Could you please give me a hint to figure this out?
Thank you!
The documentation: http://statsmodels.sourceforge.net/devel/example_formulas.html
In regression, an interaction effect exists when the effect of an independent variable on a dependent variable changes, depending on the value(s) of one or more other independent variables.
Why include an interaction term? A model without interactions assumes that the effect of each predictor on the outcome is independent of other predictors in the model.
Interaction: An interaction occurs when an independent variable has a different effect on the outcome depending on the values of another independent variable.
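As a rough illustration of the definition above, a linear model with two numeric predictors and their interaction can be written as follows. The symbols $\beta_0, \dots, \beta_3$ and $\varepsilon$ are notation assumed here for illustration, not taken from the original post:

```latex
y = \beta_0 + \beta_1 a + \beta_2 b + \beta_3 (a \cdot b) + \varepsilon,
\qquad
\frac{\partial y}{\partial a} = \beta_1 + \beta_3 b
```

The effect of "a" on "y" is no longer the constant $\beta_1$; it shifts with the value of "b" whenever $\beta_3 \neq 0$.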
Interaction terms are added to a linear regression model when the effect of one variable depends on the value of another variable.
":" gives a regression with only the interaction term (the product of the variables), without the variables themselves (the main effects).
"*" gives a regression with the variables themselves plus their interaction, so "a * b" is shorthand for "a + b + a:b".
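For example (with import statsmodels.formula.api as smf, and assuming "a" and "b" are numeric):
a. GLMmodel = smf.glm("y ~ a:b", data=df)
You'll have only one independent variable, which is "a" multiplied by "b" (plus the intercept, which the formula adds by default).
b. GLMmodel = smf.glm("y ~ a * b", data=df)
You'll have 3 independent variables: "a" itself, "b" itself, and "a" multiplied by "b". The same formula syntax works with smf.ols.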
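To check the difference yourself, here is a minimal sketch using OLS, as in the question. The data is synthetic and the coefficient values, sample size, and variable names are made up for illustration only; it assumes numpy, pandas, and statsmodels are installed.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (assumed for illustration): y depends on a, b and a*b.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.normal(size=200), "b": rng.normal(size=200)})
noise = rng.normal(scale=0.1, size=200)
df["y"] = 1.0 + 2.0 * df["a"] - 1.0 * df["b"] + 0.5 * df["a"] * df["b"] + noise

# ":" keeps only the interaction term (plus the default intercept).
only_interaction = smf.ols("y ~ a:b", data=df).fit()

# "*" keeps both main effects and the interaction.
full = smf.ols("y ~ a * b", data=df).fit()

# Spelling out the expansion of "a * b" by hand.
explicit = smf.ols("y ~ a + b + a:b", data=df).fit()

colon_terms = list(only_interaction.params.index)
star_terms = list(full.params.index)
print(colon_terms)
print(star_terms)
print(np.allclose(full.params.values, explicit.params.values))
```

The first list should contain only the intercept and "a:b", the second should also contain "a" and "b", and the last line compares the "a * b" fit with the hand-expanded "a + b + a:b" fit.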