I am going through Hadley Wickham's "R for Data Science" where he uses ~var in ggplot calls.
I understand y ~ a + bx, where ~ describes a formula/relationship between dependent and independent variables, but what does ~var mean? More importantly, why can't you just put the variable itself? See code below:
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~ class, nrow = 2)
or
demo <- tribble(
  ~cut,         ~freq,
  "Fair",       1610,
  "Good",       4906,
  "Very Good",  12082,
  "Premium",    13791,
  "Ideal",      21551
)
ggplot(data = demo) +
  geom_bar(mapping = aes(x = cut, y = freq), stat = "identity")
It's just ggplot making use of the formula structure to let the user decide what variables to facet on. From ?facet_grid:
For compatibility with the classic interface, rows can also be a formula with the rows (of the tabular display) on the LHS and the columns (of the tabular display) on the RHS; the dot in the formula is used to indicate there should be no faceting on this dimension (either row or column).
So facet_grid(. ~ var) just means to facet the grid on the variable var, with the facets spread over columns. It's the same as facet_grid(cols = vars(var)).
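To make the LHS/RHS convention concrete, here is a minimal base R sketch of how a function could read row and column variables out of such a formula. The helper split_facets is hypothetical and is not how ggplot2 is actually implemented; it only illustrates the idea that the dot means "nothing on this dimension".

```r
# Hypothetical helper, not part of ggplot2: split a two-sided formula into
# row and column variable names, treating "." as "no faceting here".
split_facets <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 3)
  # Collect the variable names on one side, ignoring the dot placeholder.
  side_vars <- function(expr) setdiff(all.vars(expr), ".")
  list(rows = side_vars(f[[2]]), cols = side_vars(f[[3]]))
}

spec <- split_facets(. ~ var)
spec$rows  # empty: nothing on the row dimension
spec$cols  # "var"
```

Nothing in the formula is evaluated here; the function only inspects the names that were written on each side of the ~.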
Despite looking like a formula, it's not really being used as a formula: it's just a way to present multiple arguments to R in a way that the facet_grid code can clearly and unambiguously interpret.
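As for why the variable can't simply be passed bare, the following base R sketch may help. It assumes nothing beyond base R and only shows the general behaviour of ~, not ggplot2's internals: the tilde stores the name without evaluating it, whereas a bare name is looked up in the calling environment as soon as it is used.

```r
# `class` here is meant as a column of mpg, not an object in the workspace.
# Wrapping it in ~ stores the name unevaluated.
f <- ~ class
inherits(f, "formula")  # TRUE
all.vars(f)             # "class"

# A bare `class`, by contrast, is looked up straight away and finds
# base R's class() function rather than a column of the data.
is.function(class)      # TRUE
```

tribble() relies on the same property: headings written as ~cut and ~freq can be told apart from the cell values that follow because they arrive as formulas rather than as evaluated objects.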