I have a data frame with variables, say a, b, c, d:
dat <- data.frame(a=runif(1e5), b=runif(1e5), c=runif(1e5), d=runif(1e5))
and would like to generate all possible two-way interaction terms between each of the columns, that is: ab, ac, ad, bc, bd, cd. In reality my dataframe has over 100 columns, so I cannot code this manually. What is the most efficient way to do this (noting that I do not want both ab and ba)?
What do you plan to do with all these interaction terms? There are several options, and which is best depends on what you are trying to do.
If you want to pass the interactions to a modeling function like lm or aov, then it is very simple: just use the .^2 syntax:
fit <- lm( y ~ .^2, data=mydf )
The above will call lm and tell it to fit all the main effects and all two-way interactions for the variables in mydf, excluding y.
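To see which terms the .^2 formula expands to, here is a minimal sketch on a small simulated data frame. The column names and the random response y are assumptions made up for illustration, not part of the question's data.

```r
# Hypothetical toy data: three predictors and a random response
set.seed(1)
mydf <- data.frame(y = rnorm(50), a = runif(50), b = runif(50), c = runif(50))

# .^2 expands to all main effects plus all two-way interactions
fit <- lm(y ~ .^2, data = mydf)

# Each pair appears once (a:b, but not b:a)
names(coef(fit))
```

The coefficient names list the intercept, the three main effects, and the three pairwise interactions, each pair only once.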
If for some reason you really want to calculate all the interactions then you can use model.matrix:
tmp <- model.matrix( ~.^2, data=iris)
This will include a column for the intercept and columns for the main effects, but you can drop those if you don't want them.
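One way to drop the intercept and main-effect columns is to keep only the columns whose names contain a colon. This is a sketch that assumes all columns are numeric and that none of the original variable names contain ":" themselves; the small dat below is a cut-down stand-in for the question's data.

```r
# Small stand-in for the question's data frame
set.seed(1)
dat <- data.frame(a = runif(10), b = runif(10), c = runif(10), d = runif(10))

# Full design matrix: intercept, main effects, two-way interactions
mm <- model.matrix(~ .^2, data = dat)

# Interaction columns are named like "a:b", so select on the colon
inter <- mm[, grepl(":", colnames(mm), fixed = TRUE), drop = FALSE]
colnames(inter)
```

With factor columns (as in iris) each factor level gets its own dummy column, so the number of interaction columns can be larger than the number of variable pairs.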
If you need something different from the modeling then you can use the combn function, as @akrun mentions in the comments.
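A possible combn approach, sketched here under the assumption that "interaction" means the elementwise product of two numeric columns (the comment itself is not reproduced in this thread, so this may differ from what was suggested there):

```r
# Small stand-in for the question's data frame
set.seed(1)
dat <- data.frame(a = runif(10), b = runif(10), c = runif(10), d = runif(10))

# All unordered pairs of column names: a 2 x choose(4, 2) character matrix
idx <- combn(names(dat), 2)

# One product column per pair
prods <- apply(idx, 2, function(p) dat[[p[1]]] * dat[[p[2]]])
colnames(prods) <- apply(idx, 2, paste, collapse = "")

colnames(prods)
```

Because combn only generates unordered pairs, ab appears but ba does not. With 100 columns there are choose(100, 2) = 4950 pairs, so at 1e5 rows the result is roughly 4 GB as a dense numeric matrix; it may be worth checking available memory first.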