I am going through Hadley Wickham's "R for Data Science" where he uses ~var in ggplot calls.
I understand y ~ a + bx, where ~ describes a formula/relationship between dependent and independent variables, but what does ~var mean? More importantly, why can't you just put the variable itself? See code below:
library(tidyverse)  # attaches ggplot2 (with the mpg data) and tibble (for tribble)
ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~ class, nrow = 2)
or
demo <- tribble(
  ~cut,        ~freq,
  "Fair",       1610,
  "Good",       4906,
  "Very Good", 12082,
  "Premium",   13791,
  "Ideal",     21551
)
ggplot(data = demo) +
  geom_bar(mapping = aes(x = cut, y = freq), stat = "identity")
It's just ggplot making use of the formula structure to let the user decide what variables to facet on. From ?facet_grid:
For compatibility with the classic interface, rows can also be a formula with the rows (of the tabular display) on the LHS and the columns (of the tabular display) on the RHS; the dot in the formula is used to indicate there should be no faceting on this dimension (either row or column).
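So facet_grid(. ~ var) just means to facet the grid on the variable var, with the facets spread over columns. It's the same as facet_grid(cols = vars(var)). The facet_wrap(~ class) call in the question works the same way: a one-sided formula whose right-hand side names the faceting variable.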
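To make the equivalence concrete, here is a small sketch using the mpg data from the question. It assumes a version of ggplot2 recent enough to have the vars() interface (3.0.0 or later, as far as I know); the object names base, p_formula, p_vars, p_wrap_formula and p_wrap_vars are made up for illustration.

```r
library(ggplot2)

# Shared base plot, taken from the question
base <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Formula interface: nothing on the rows (the dot), `class` on the columns
p_formula <- base + facet_grid(. ~ class)

# vars() interface: the same layout without a formula
p_vars <- base + facet_grid(cols = vars(class))

# facet_wrap() accepts both spellings as well
p_wrap_formula <- base + facet_wrap(~ class, nrow = 2)
p_wrap_vars    <- base + facet_wrap(vars(class), nrow = 2)
```

Printing p_formula and p_vars should give the same panel layout, and likewise for the two facet_wrap() versions.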
Despite looking like a formula, it's not really being used as a formula: it's just a way to present multiple arguments to R in a way that the facet_grid code can clearly and unambiguously interpret.
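As for why you can't just put the variable itself: the tilde stops R from evaluating the name straight away. The sketch below illustrates this with base R and then shows the tribble() call from the question, where ~cut and ~freq only mark column names. The names f and demo_tibble are introduced here for illustration, and the data is shortened to two rows.

```r
library(tibble)

# A one-sided formula stores its right-hand side without evaluating it
f <- ~ class
class(f)      # "formula"
all.vars(f)   # "class"
f[[2]]        # the bare symbol `class`

# Without the tilde, R looks the name up immediately. Outside the data
# frame it finds the base function class(), not the column of mpg.
is.function(class)  # TRUE

# tribble() uses the tilde the same way: ~cut and ~freq name the columns
demo <- tribble(
  ~cut,   ~freq,
  "Fair", 1610,
  "Good", 4906
)

# The same table built column by column with tibble()
demo_tibble <- tibble(cut = c("Fair", "Good"), freq = c(1610, 4906))
all.equal(demo, demo_tibble)
```

So facet_wrap(class) would hand the function class() to ggplot instead of a column name, which is why the formula (or vars()) wrapper is needed.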