
R - interaction with only one factor level in regression

Tags: r, lm

In a regression model, is it possible to include an interaction with only one dummy variable of a factor? For example, suppose I have:

x: numerical vector taking 3 distinct values (1, 2 and 3)
y: response variable
z: numerical vector

Is it possible to build a model like:

y ~ factor(x) + factor(x) : z

but only include the interaction with one level of x? I realize that I could create a separate dummy variable for each level of x, but I would like to simplify things if possible.

Really appreciate any input!!

user2081788 asked Feb 18 '13 02:02


1 Answer

One key point you're missing is that a significant effect for a term like x2:z doesn't mean that x interacts with z when x == 2. It means that the difference between x == 2 and x == 1 (or whatever your reference level is) interacts with z. It's not a level of x that interacts with z; it's one of the contrasts that has been set for x.

So for a 3-level factor with default treatment contrasts:

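set.seed(1)  # any seed works; it only makes the example reproducible
df <- data.frame(x = sample(1:3, 10, TRUE), y = rnorm(10), z = rnorm(10))
# Fix the levels so all three exist even if one was not sampled
df$x <- factor(df$x, levels = 1:3)
contrasts(df$x)
#   2 3
# 1 0 0
# 2 1 0
# 3 0 1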
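
To see what an interaction term such as x2:z actually contains, it can help to look at the design matrix R builds. The following is a minimal sketch using its own made-up data (the seed and the name x2_by_z are arbitrary choices for illustration): the x2:z column is simply the x == 2 versus reference dummy multiplied by z.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(
  x = factor(sample(1:3, 10, TRUE), levels = 1:3),
  z = rnorm(10)
)

# Design matrix for both main effects plus their interaction
mm <- model.matrix(~ x * z, data = df)
colnames(mm)
# expected column names: (Intercept), x2, x3, z, x2:z, x3:z

# The "x2:z" column is the x == 2 dummy (second contrast row) times z
x2_by_z <- as.numeric(df$x == 2) * df$z
all.equal(unname(mm[, "x2:z"]), x2_by_z)
```

Because the model also contains the z main effect, the coefficient on x2:z is the difference in the slope of z between x == 2 and the reference level, not the slope at x == 2 itself.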

If you really think that only the first contrast is important, you can create a new variable that compares x == 2 to x == 1, and ignores x == 3:

# Dummy for the x == 2 vs x == 1 contrast; rows with x == 3 stay NA
df$x_1vs2 <- NA_real_
df$x_1vs2[df$x == 1] <- 0
df$x_1vs2[df$x == 2] <- 1

And then run your regression using that (rows where x == 3 have NA in x_1vs2, so lm drops them by default):

lm(y ~ x_1vs2 + x_1vs2:z, data = df)
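
If the goal is instead what the question literally asks for, keeping all three levels of x as main effects while letting z enter only at one level, one possible sketch is to build the product column by hand. The data, the seed, and the name z_at_x2 below are assumptions for illustration, not part of the original answer. Note that leaving out the z main effect assumes the slope of z is exactly zero when x is 1 or 3, which is a strong modelling assumption.

```r
set.seed(42)  # arbitrary seed for reproducibility
df <- data.frame(
  x = factor(rep(1:3, each = 10)),
  y = rnorm(30),
  z = rnorm(30)
)

# z_at_x2 (hypothetical name) equals z where x == 2 and 0 elsewhere
df$z_at_x2 <- as.numeric(df$x == 2) * df$z

# All three levels of x enter as main effects; z has a slope only within x == 2
fit <- lm(y ~ x + z_at_x2, data = df)
names(coef(fit))
nobs(fit)  # all 30 rows are used, since no NA values were introduced
```

Unlike the NA-based dummy, this keeps every observation in the fit, and the z_at_x2 coefficient is the slope of z within x == 2 under the zero-slope assumption for the other levels.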

Marius answered Sep 20 '22 13:09