In a regression model is it possible to include an interaction with only one dummy variable of a factor? For example, suppose I have:
x: numerical vector taking 3 values (1, 2 and 3)
y: response variable
z: numerical vector
Is it possible to build a model like:
y ~ factor(x) + factor(x) : z
but only include the interaction with one level of x? I realize that I could create a separate dummy variable for each level of x, but I would like to simplify things if possible.
Really appreciate any input!!
One key point you're missing is that when you see a significant effect for something like x2:z, that doesn't mean that x interacts with z when x == 2; it means that the difference between x == 2 and x == 1 (or whatever your reference level is) interacts with z. It's not a level of x that is interacting with z, it's one of the contrasts that has been set for x.
So for a 3 level factor with default treatment contrasts:
df <- data.frame(x = sample(1:3, 10, TRUE), y = rnorm(10), z = rnorm(10))
df$x <- factor(df$x)
contrasts(df$x)
  2 3
1 0 0
2 1 0
3 0 1
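To see how these contrast columns enter a model, here is a small sketch. It assumes a model with a main effect of z (y ~ x * z), which is not exactly the formula in the question, and it uses a fixed seed and 30 rows purely for illustration. Under those assumptions, the x2:z column of the design matrix is just the x2 contrast column multiplied by z:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(x = factor(sample(1:3, 30, TRUE)),
                 y = rnorm(30),
                 z = rnorm(30))

# Design matrix for a model with main effects and the interaction
mm <- model.matrix(~ x * z, data = df)
colnames(mm)

# The x2:z column equals the 0/1 "x == 2" contrast column times z
all.equal(unname(mm[, "x2:z"]), as.numeric(df$x == 2) * df$z)
```

Because z has its own main effect in this sketch, the x2:z coefficient measures how the slope of z at x == 2 differs from the slope at the reference level x == 1.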
If you really think that only the first contrast is important, you can create a new variable that compares x == 2 to x == 1, and ignores x == 3:
df$x_1vs2 <- NA
df$x_1vs2[df$x == 1] <- 0
df$x_1vs2[df$x == 2] <- 1
df$x_1vs2[df$x == 3] <- NA
And then run your regression using that:
lm(y ~ x_1vs2 + x_1vs2:z, data = df)
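One consequence worth checking is that the NA coding removes every x == 3 row from the fit, since lm() drops incomplete rows by default. The following is a minimal self-contained sketch of the steps above; the seed, the sample size of 30, and the ifelse() shortcut for building x_1vs2 are my own assumptions rather than part of the original code:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(x = factor(sample(1:3, 30, TRUE)),
                 y = rnorm(30),
                 z = rnorm(30))

# 0 for x == 1, 1 for x == 2, NA for x == 3
df$x_1vs2 <- ifelse(df$x == 3, NA, as.numeric(df$x == 2))

fit <- lm(y ~ x_1vs2 + x_1vs2:z, data = df)
names(coef(fit))   # intercept, x_1vs2, and the x_1vs2:z interaction

# Rows with x == 3 are dropped by the default na.action
nobs(fit)
sum(df$x != 3)
```

If the two counts at the end match, the model was fitted only on the x == 1 and x == 2 rows, so it says nothing about x == 3.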