Asterisk (*) vs. colon (:) in R formulas [closed]

I always thought that * and : meant the same thing when adding interaction terms in R formulas. For example:

  • amount_of_gas ~ temperature*gas_type
  • amount_of_gas ~ temperature:gas_type

However, now that I've started using generalized linear models (glm() in R), I see that the two formulas produce different scores, different estimates, etc. when I switch between them. Can someone explain why this happens? Is it a problem with the stats package in R?

##UPDATE## April 26th, 2021

It's been about 4.5 years since I asked this question, and I keep getting notified that it still gets a lot of traffic. Here's the short answer: y ~ x*z is shorthand for y ~ x + z + x:z, while y ~ x:z contains only the interaction of x and z, without the two main effects (as described in the answer below).
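
A minimal sketch of that difference with glm(). It uses the built-in mtcars data rather than the gas data from the question, so mpg, wt, and hp are stand-in variables chosen only for illustration:

```r
# Illustration only: mtcars ships with R; mpg, wt and hp stand in for
# the question's amount_of_gas, temperature and gas_type.
fit_star     <- glm(mpg ~ wt * hp,         data = mtcars)
fit_expanded <- glm(mpg ~ wt + hp + wt:hp, data = mtcars)
fit_colon    <- glm(mpg ~ wt:hp,           data = mtcars)

# '*' and its written-out form describe the same model
all.equal(coef(fit_star), coef(fit_expanded))

# ':' alone drops the main effects, so fewer coefficients are estimated
names(coef(fit_star))
names(coef(fit_colon))
```

Because the model with only the interaction has fewer terms, its estimates and fit statistics will generally differ from the full model; that is expected behaviour, not a problem with the stats package.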

asked Nov 12 '16 at 20:11 by Leo Ohyama


1 Answer

From help(formula):

 In addition to ‘+’ and ‘:’, a number of other operators are useful
 in model formulae.  The ‘*’ operator denotes factor crossing:
 ‘a*b’ interpreted as ‘a+b+a:b’.
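
One way to check this expansion yourself is to ask R which terms it derives from each formula. This is a small sketch; y, a, and b are placeholder names, and no data is needed because terms() works on the formula alone:

```r
# y, a and b are placeholder names used only for illustration.
labels_star  <- attr(terms(y ~ a * b), "term.labels")
labels_colon <- attr(terms(y ~ a:b),   "term.labels")

labels_star   # main effects plus the interaction
labels_colon  # the interaction only
```

Comparing the two label vectors shows which terms each operator puts into the model before any fitting happens.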
answered Nov 15 '22 at 10:11 by Dirk Eddelbuettel