
Plotting lines and the group aesthetic in ggplot2

Tags:

r

ggplot2

This question follows on from an earlier question and its answers.

First some toy data:

df = read.table(text = "
School  Year  Value
A       1998  5
B       1999  10
C       2000  15
A       2000  7
B       2001  15
C       2002  20
", sep = "", header = TRUE)

The original question asked how to plot Value-Year lines for each School. The answers more or less correspond to p1 and p2 below. But also consider p3.

library(ggplot2)

(p1 <- ggplot(data = df, aes(x = Year, y = Value, colour = School)) +
    geom_line() + geom_point())

(p2 <- ggplot(data = df, aes(x = factor(Year), y = Value, colour = School)) +
    geom_line(aes(group = School)) + geom_point())

(p3 <- ggplot(data = df, aes(x = factor(Year), y = Value, colour = School)) +
    geom_line() + geom_point())

Both p1 and p2 do the job. The difference between p1 and p2 is that p1 treats Year as numeric whereas p2 treats Year as a factor. Also, p2 contains a group aesthetic in geom_line. But when the group aesthetic is dropped as in p3, the lines are not drawn.

The question is: Why is the group aesthetic necessary when the x-axis variable is a factor but the group aesthetic is not needed when the x-axis variable is numeric?

Sandy Muspratt asked Apr 27 '12 21:04




1 Answer

In the words of Hadley himself:

The important thing [for a line graph with a factor on the horizontal axis] is to manually specify the grouping. By default ggplot2 uses the combination of all categorical variables in the plot to group geoms - that doesn't work for this plot because you get an individual line for each point. Manually specify group = 1 indicates you want a single line connecting all the points.
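To see that default grouping at work, here is a minimal sketch that counts the groups ggplot2 computes for the line layer. It rebuilds the question's toy data with data.frame() and uses layer_data() to read the computed group column; the names p_numeric and p_factor are made up for this illustration.

```r
library(ggplot2)

# Toy data from the question, rebuilt with data.frame()
df <- data.frame(
  School = c("A", "B", "C", "A", "B", "C"),
  Year   = c(1998, 1999, 2000, 2000, 2001, 2002),
  Value  = c(5, 10, 15, 7, 15, 20)
)

# Numeric x: School (mapped to colour) is the only discrete variable
p_numeric <- ggplot(df, aes(x = Year, y = Value, colour = School)) +
  geom_line()

# Factor x and no explicit group: factor(Year) and School are both discrete
p_factor <- ggplot(df, aes(x = factor(Year), y = Value, colour = School)) +
  geom_line()

# layer_data() returns the built data for a layer, including the group column
n_groups_numeric <- length(unique(layer_data(p_numeric, 1)$group))
n_groups_factor  <- length(unique(layer_data(p_factor, 1)$group))

print(c(numeric_x = n_groups_numeric, factor_x = n_groups_factor))
```

With numeric Year there should be one group per School, so each line has two points to connect. With factor(Year) every School-Year combination becomes its own group of one point, which leaves geom_line() nothing to join.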

You can actually group the points in very different ways as demonstrated by koshke here
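As a rough illustration of two such groupings on the question's toy data (a sketch only, not koshke's example): group = School gives one line per school, while group = 1 joins every point into a single line. The object names and the grey line colour are choices made for this example.

```r
library(ggplot2)

# Toy data from the question, rebuilt with data.frame()
df <- data.frame(
  School = c("A", "B", "C", "A", "B", "C"),
  Year   = c(1998, 1999, 2000, 2000, 2001, 2002),
  Value  = c(5, 10, 15, 7, 15, 20)
)

# Shared base: factor x-axis, points coloured by School
base <- ggplot(df, aes(x = factor(Year), y = Value, colour = School)) +
  geom_point()

# One line per School (same idea as p2 in the question)
p_by_school <- base + geom_line(aes(group = School))

# A single line through all points; the fixed colour overrides the
# inherited colour = School mapping for this layer only
p_single <- base + geom_line(aes(group = 1), colour = "grey40")

# The line layer is layer 2 in both plots (geom_point is layer 1)
n_by_school <- length(unique(layer_data(p_by_school, 2)$group))
n_single    <- length(unique(layer_data(p_single, 2)$group))
```

Printing p_by_school or p_single draws the plot; the two counts simply confirm how many separate lines each grouping asks for.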

daedalus answered Oct 03 '22 14:10