
Condition ( | ) in R formula

Tags:

r

r-formula

I found this pdf on R formulas and I am not able to figure out how the | works (see the table on the second page). Furthermore, I could not find any explanation on the web. It appears from time to time in lists for possible formula symbols but without any example.

I think that it might be out of date because of other ways to achieve whatever it did.

Does anybody know how to use | in a formula and what exactly it achieves?

Here is a bit of code which shows my clumsy attempt to use |:

x <- rnorm(100)
y <- rnorm(100)
z <- sample(c(TRUE, FALSE), 100, replace = TRUE )

lm(y ~ x|z)
Alex asked Feb 23 '17 14:02


2 Answers

The symbol | means different things depending on the context:

The general case

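In general, | means OR. General modeling functions such as lm() do not treat | as a formula operator, so they evaluate it as the logical operator it is. This is comparable to putting a function call, or an arithmetic expression protected by I(), in a formula, e.g.:

lm(y ~ x + I(x^2))

(A bare x^2 would not work here: ^ is itself a formula operator, so y ~ x + x^2 reduces to y ~ x.)

The expression is evaluated first, and the resulting variable is then used to construct the model matrix and do the fitting.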

In your code, | also means OR. Keep in mind that R coerces numeric values to logical when you use a logical operator: 0 is treated as FALSE, anything else as TRUE.
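
The coercion rule can be checked directly at the console. The sketch below uses two made-up vectors (num and lgl are hypothetical names, not from the question) purely for illustration:

```r
# Hypothetical toy vectors, chosen only to show the coercion rule.
num <- c(0, 1.5, -2, 0)
lgl <- c(FALSE, FALSE, FALSE, TRUE)

as.logical(num)   # 0 becomes FALSE, every other number becomes TRUE
res <- num | lgl  # element-wise OR, applied after that coercion
res
```

Only the first element ends up FALSE, because it is the only position where the number is 0 and the logical value is FALSE.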

So your call to lm constructs a model of y as a function of x OR z, which doesn't make any sense. Because x is drawn with rnorm(), it is practically never exactly 0, so x | z is TRUE for every observation and the model reduces to y ~ TRUE. This is also the reason your model doesn't fit: the model matrix has two columns of 1's, one for the intercept and one for the only value in x|z, TRUE. The two columns are identical, so the coefficient for x|z can't be estimated, as the output shows:

> lm(y ~ x|z)

Call:
lm(formula = y ~ x | z)

Coefficients:
(Intercept)    x | zTRUE  
   -0.01925           NA  
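
To see where the NA comes from, one can inspect what the formula evaluates to and what the resulting model matrix looks like. This is a minimal sketch that regenerates data in the same way as the question; the seed is an assumption added here for reproducibility, so the numbers will differ from the output above:

```r
set.seed(1)  # assumption: seed added here, not part of the original question
x <- rnorm(100)
y <- rnorm(100)
z <- sample(c(TRUE, FALSE), 100, replace = TRUE)

all(x | z)                      # x is never exactly 0, so every element is TRUE

mm <- model.matrix(y ~ x | z)   # the design matrix that lm() would build
colnames(mm)                    # intercept plus one column for `x | z`
all(mm[, 2] == 1)               # the second column merely repeats the intercept
qr(mm)$rank                     # rank of the design matrix
```

With two identical columns the design matrix is rank deficient, which is why lm() reports NA for the second coefficient.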

Inside formulas for mixed models

In mixed models (e.g. the lme4 package), | is used to indicate a random effect. A term like + (1 | X) means: "fit a random intercept for every category in X". You can translate the | as "given", so the term reads "fit an intercept, given X". If you keep this in mind, the use of | in specifications of correlation structures in e.g. the nlme or mgcv packages will make more sense to you.

You still have to be careful, as the exact way | is interpreted depends largely on the package you use. The only way to really know what it means for a given modeling function is to check the manual of that package.
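
As an illustration of the random-intercept reading, here is a minimal sketch with lme4. It assumes lme4 is installed and uses the sleepstudy data set that ships with that package; neither the data set nor the model comes from the question:

```r
library(lme4)  # assumption: the lme4 package is installed

# Random intercept: "fit an intercept, given Subject".
# sleepstudy is an example data set bundled with lme4.
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# One estimated intercept deviation per level of the grouping factor.
re <- ranef(fit)$Subject
nrow(re)
nlevels(sleepstudy$Subject)
```

The number of rows in the random-effects table matches the number of subjects, which is the "one intercept per category" idea described above.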

Other uses

Some other functions and packages also use the | symbol in a formula interface. Here too, it pretty much boils down to indicating some kind of grouping. One example is the lattice graphics system, where | is used for faceting (conditioning), as shown by the following code:

library(lattice)
densityplot(~Sepal.Width|Species,
            data = iris,
            main="Density Plot by Species",
            xlab="Sepal width")
Joris Meys answered Oct 04 '22 15:10


The general way it is used is dependent ~ independent | grouping. You can read more here: http://talklab.psy.gla.ac.uk/KeepItMaximalR2.pdf
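
A function in base R that follows this y ~ x | grouping pattern is coplot() from the graphics package. The sketch below uses the built-in iris data as an arbitrary example; it is meant as one illustration of the pattern, not as the only meaning of | in a formula:

```r
# The formula is stored unevaluated; the function receiving it decides
# what the | means. Here the right-hand side is a call to `|`.
f <- Sepal.Length ~ Sepal.Width | Species
rhs_op <- as.character(f[[3]][[1]])
rhs_op

# coplot() draws one panel of Sepal.Length against Sepal.Width
# for each level of Species.
coplot(Sepal.Length ~ Sepal.Width | Species, data = iris)
```

Passing the same formula to lm() would instead evaluate Sepal.Width | Species as a logical OR, which is the situation from the question.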

Dinesh.hmn answered Oct 04 '22 17:10