Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

expanding factor interactions within a formula

Tags:

r

I have many formulas (of class formula or Formula) of the form y ~ a*b, where a and b are factors.

I need to write a function that takes such a formula and returns a formula with all of the terms in the interaction "spelled out." Here is an example:

fac1 <- factor(c('a', 'a', 'b', 'b'))
fac2 <- factor(c('c', 'd', 'c', 'd'))
BigFormula(formula(x ~ fac1*fac2))

where BigFormula returns formula(x ~ a + b + c + d + a:c + a:d + b:c + b:d).

Is there a simple way to do this?

(The context: I am running many commands of the form anova(mod1, mod2), where mod2 nests in mod1, and where the right-hand side of both models contains terms like fac1*fac2. The point of these commands is to calculate F-statistics. The problem is that anova treats fac1*fac2 as three variables, even though it usually represents more than three variables. (In the code above, for example, fac1*fac2 represents eight variables.) As a result, anova underestimates the number of restrictions in the nested model, and it overestimates my degrees of freedom.)

like image 209
user697473 Avatar asked Jul 21 '12 19:07

user697473


3 Answers

I still have yet to learn all the tricks of formula, but if I want explicit formulas I'll tend to use sapply along with pasting:

# the factors
fac1 <- factor(c('a', 'a', 'b', 'b'))
fac2 <- factor(c('c', 'd', 'c', 'd'))

# create all the interaction terms
out <- sapply(levels(fac1), function(ii) {
  sapply(levels(fac2), function(jj) {
    paste0(ii,":",jj)
  })
})
# along with the single terms
terms <- c(levels(fac1), levels(fac2), as.vector(out))

# and create the rhs of the formula
rhs <- paste0(terms, collapse=" + ")

# finally add the lhs
f <- paste0("x ~ ", rhs)

We end up with:

> f
[1] "x ~ a + b + c + d + a:c + a:d + b:c + b:d"
like image 93
mindless.panda Avatar answered Nov 19 '22 03:11

mindless.panda


Look at the help for formula there may be existing things that will work for you.

For example the formula y ~ (a + b + c + d)^2 will give you all main effects and all 2 way interactions and the formula y ~ (a + b) * (c + d) gives the expansion that you show above. You can also subtract terms so y ~ a*b*c - a:b:c will not include the 3 way interaction.

like image 22
Greg Snow Avatar answered Nov 19 '22 02:11

Greg Snow


How about the following solution. I use a more extreme example of a complex interaction.

f = formula(y ~ a * b * c * d * e)

To spell out the interaction terms, we extract the terms from the value returned by terms.formula():

terms = attr(terms.formula(f), "term.labels")

which yields:

> terms
 [1] "a"         "b"         "c"         "d"         "e"         "a:b"       "a:c"      
 [8] "b:c"       "a:d"       "b:d"       "c:d"       "a:e"       "b:e"       "c:e"      
[15] "d:e"       "a:b:c"     "a:b:d"     "a:c:d"     "b:c:d"     "a:b:e"     "a:c:e"    
[22] "b:c:e"     "a:d:e"     "b:d:e"     "c:d:e"     "a:b:c:d"   "a:b:c:e"   "a:b:d:e"  
[29] "a:c:d:e"   "b:c:d:e"   "a:b:c:d:e"

And then we can convert it back to a formula:

f = as.formula(sprintf("y ~ %s", paste(terms, collapse="+")))

> f
y ~ a + b + c + d + e + a:b + a:c + b:c + a:d + b:d + c:d + a:e + 
    b:e + c:e + d:e + a:b:c + a:b:d + a:c:d + b:c:d + a:b:e + 
    a:c:e + b:c:e + a:d:e + b:d:e + c:d:e + a:b:c:d + a:b:c:e + 
    a:b:d:e + a:c:d:e + b:c:d:e + a:b:c:d:e
like image 8
themeo Avatar answered Nov 19 '22 02:11

themeo