
What does a tilde (~) in front of a single variable mean (facet_wrap)?

Tags: r, ggplot2

I am going through Hadley Wickham's "R for Data Science" where he uses ~var in ggplot calls.

I understand y ~ a + bx, where ~ describes a formula/relationship between dependent and independent variables, but what does ~var mean? More importantly, why can't you just put the variable itself? See code below:

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy)) + 
  facet_wrap(~ class, nrow = 2)

or

demo <- tribble(
  ~cut,         ~freq,
  "Fair",       1610,
  "Good",       4906,
  "Very Good",  12082,
  "Premium",    13791,
  "Ideal",      21551
)

ggplot(data = demo) +
  geom_bar(mapping = aes(x = cut, y = freq), stat = "identity")

Hank Lin asked Aug 17 '18 18:08


1 Answer

It's just ggplot2 using the formula syntax to let the user specify which variables to facet on. From ?facet_grid:

For compatibility with the classic interface, rows can also be a formula with the rows (of the tabular display) on the LHS and the columns (of the tabular display) on the RHS; the dot in the formula is used to indicate there should be no faceting on this dimension (either row or column).

So facet_grid(. ~ var) just means to facet the grid on the variable var, with the facets spread over columns. It's the same as facet_grid(cols = vars(var)).
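
As a rough illustration of that equivalence, the sketch below builds the same plot with both interfaces, using the mpg data from the question. It assumes ggplot2 3.0.0 or later (the release that added vars()); the object names p_formula, p_vars, p_wrap_formula and p_wrap_vars are made up for this example.

```r
library(ggplot2)

# Formula interface: no faceting on rows (the dot), `class` spread over columns.
p_formula <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(. ~ class)

# vars() interface: the same faceting, spelled out with a named argument.
p_vars <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(cols = vars(class))

# The same pair for facet_wrap(), as used in the question.
p_wrap_formula <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~ class, nrow = 2)

p_wrap_vars <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(vars(class), nrow = 2)
```

Printing either member of a pair should draw the same panels, so the choice between the two spellings is mostly a matter of taste.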

Despite looking like a formula, it isn't being used as a model formula: it's just a way to pass variable names to R, unevaluated, so that the facet_grid code can interpret them clearly and unambiguously.
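
To see why the tilde is needed and a bare variable name won't do, it can help to inspect what ~ class actually is. The base R sketch below (no ggplot2 required; f and g are names invented for the example) shows that the tilde captures the name without evaluating it, whereas a bare class would be looked up immediately.

```r
# A one-sided formula is an unevaluated expression: R stores the symbol
# `class` instead of looking up its value.
f <- ~ class

inherits(f, "formula")  # TRUE
all.vars(f)             # "class"
length(f)               # 2: the `~` operator and its right-hand side

# A two-sided formula additionally carries a left-hand side.
g <- hwy ~ displ
length(g)               # 3
all.vars(g)             # "hwy" "displ"

# Without `~`, the bare name is evaluated right away. Here `class` resolves
# to the base R function of that name, not to the column in mpg.
is.function(class)      # TRUE
```

The faceting function receives the formula, pulls out the variable names, and only then looks them up in the plot's data.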

divibisan answered Sep 27 '22 18:09