
When does the argument go inside or outside aes()?

Tags: r, ggplot2

I am following Chapter 1 of Wickham and Grolemund's "R for data science" on visualization.

I have tried:

 ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

hoping to get a plot with all points colored blue, but instead, to my surprise, they were all red! Reading the correct code to get blue points, on page 11 of the printed version or in Section 3.3 of the online version, I found it should be

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

and, in fact, they state that to set an aesthetic manually you have to give it outside the aes() function, but inside the corresponding geom, geom_point() here. Why is that? What is the exact explanation for this behavior? It seemed natural to me that the correct syntax would be that of the first command. I guess this issue is related to layers and/or to the scope of variables, but I just could not get the hang of it... Can someone spoon-feed me?

Edit: Sorry for not doing my homework properly: this is just Exercise 1, proposed in the text itself at the end of the corresponding section... The answer, however, still escapes me.

Mauricio Calvao asked Jan 25 '17 22:01


1 Answer

This issue, and more specifically the difference in output between the two commands above, is explicitly dealt with in Section 5.4.2 of the 2nd edition of "ggplot2: Elegant Graphics for Data Analysis", by Hadley Wickham himself:

Either:

  • you can map (inside aes) a variable of your data to an aesthetic, e.g., aes(..., color = VarX), or ...
  • you can set (outside aes, but inside a geom element) an aesthetic to a constant value e.g. "blue"
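The two cases can be compared side by side. The following is a minimal sketch, assuming ggplot2 is installed and using its built-in mpg data; the variable names (p_mapped, p_set and so on) are only illustrative, and class is picked here as an example of a column to map. layer_data() returns the values ggplot2 has computed for drawing a layer, so it shows what each form actually produces.

```r
library(ggplot2)

# Mapping: colour varies with a column of the data (here `class`),
# so ggplot2 builds a colour scale and a legend for it.
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# Setting: colour is a fixed property of the layer, given outside aes().
p_set <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

# Distinct colours each layer will be drawn with.
mapped_colours <- unique(layer_data(p_mapped)$colour)
set_colours <- unique(layer_data(p_set)$colour)

length(mapped_colours)  # one colour per distinct value of `class`
set_colours             # the literal value that was set
```

In the mapped plot the colors come from a scale, so they depend on the data; in the set plot the value is passed through untouched.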

In the first case, mapping an aesthetic such as color, ggplot2 does not read the value as a color name. It treats the constant as a new discrete variable with a single level and scales it with the default color scale, which picks evenly spaced colors around the color wheel; with only one level, every point gets the first of those colors, a pinkish red. Why should the chosen color coincide with the constant value you happened to map from? More explicitly, if you try the command:

ggplot(data = mpg) + geom_point(mapping = aes(x = displ, y = hwy, color = "foo"))

you get the same red points as in the first command of the original question; only the legend label changes, from "blue" to "foo".
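This comparison can be checked without looking at the plots. The sketch below assumes ggplot2 is installed; drawn_colours() is a hypothetical helper defined here, not part of ggplot2. The last step uses scale_colour_identity(), which is not mentioned in the book passage cited above; it is added only to show that the scale is what turns "blue" into something else.

```r
library(ggplot2)

# Hypothetical helper: the distinct colours a plot's first layer is drawn with.
drawn_colours <- function(plot) unique(layer_data(plot)$colour)

base <- ggplot(data = mpg)

mapped_blue <- base + geom_point(aes(x = displ, y = hwy, color = "blue"))
mapped_foo  <- base + geom_point(aes(x = displ, y = hwy, color = "foo"))

# Both constants pass through the same default discrete scale, so the
# resulting colour is the same for both, and it is not "blue".
same_colour <- identical(drawn_colours(mapped_blue), drawn_colours(mapped_foo))
is_blue <- drawn_colours(mapped_blue) == "blue"

# With an identity scale, mapped values are used as-is, so the constant
# "blue" inside aes() is finally read as a colour name.
mapped_identity <- mapped_blue + scale_colour_identity()
identity_colours <- drawn_colours(mapped_identity)
```

Setting color = "blue" outside aes() remains the simpler way to get blue points; the identity scale is mainly useful when a column of the data already holds color names.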

Mauricio Calvao answered Sep 25 '22 14:09