I am trying to control the order of items in the legend of a ggplot2 plot in R. From similar questions I learned that the legend order follows the order of the levels of the factor variable being plotted. I am plotting data for four months: December, January, July, and June.
If I use a single plot command for all the months, it works as expected: the months appear in the legend in the order of the factor levels. However, I need a different dodge value for the summer (June and July) and winter (December and January) data, which I do with two geom_pointrange commands. When I split the plot into these two steps, the legend order goes back to alphabetical. You can demonstrate this by commenting out the "plot summer" or "plot winter" command.
What can I change to keep my factor level order in the legend?
Please ignore the odd-looking test data; the real data looks fine in this plot format.
library(ggplot2)
#testdata
hour <- rep(seq(from=1,to=24,by=1),4)
avg_hou <- sample(seq(0,0.5,0.001),96,replace=TRUE)
lower_ci <- avg_hou - sample(seq(0,0.05,0.001),96,replace=TRUE)
upper_ci <- avg_hou + sample(seq(0,0.05,0.001),96,replace=TRUE)
Month <- c(rep("December",24), rep("January",24), rep("June",24), rep("July",24))
testdata <- data.frame(Month,hour,avg_hou,lower_ci,upper_ci)
testdata$Month <- factor(testdata$Month,levels=c("June", "July", "December","January"))
#basic plot setup
plotx <- ggplot(testdata, aes(x = hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci, color = Month, shape = Month))
plotx <- plotx + scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101", "December" = "#92C5DE", "January" = "#0571B0"))
#plot summer
plotx <- plotx + geom_pointrange(data = testdata[testdata$Month == "June" | testdata$Month == "July",], size = 1, position=position_dodge(width=0.3))
#plot winter
plotx <- plotx + geom_pointrange(data = testdata[testdata$Month == "December" | testdata$Month == "January",], size = 1, position=position_dodge(width=0.6))
print(plotx)
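As a quick check of the premise above (that the legend is expected to follow the factor's level order), the level order can be inspected directly in base R. This is only an illustrative sketch; the vector name `months_chr` is made up for the example.

```r
# Hypothetical character vector holding the four month names
months_chr <- c("December", "January", "June", "July")

# Without explicit levels, factor() sorts the levels alphabetically
default_levels <- levels(factor(months_chr))

# With explicit levels, the given order is kept
custom_levels <- levels(factor(months_chr,
                               levels = c("June", "July", "December", "January")))

print(default_levels)
print(custom_levels)
```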
Another way to think about "dodge" is as an offset from the x-values based on group (in this case Month). So if we add a dodge (x-offset) column to your original data, based on month:
# your original sample data
# note the use of set.seed(...) so "random" data is reproducible
set.seed(1)
hour <- rep(seq(from=1,to=24,by=1),4)
avg_hou <- sample(seq(0,0.5,0.001),96,replace=TRUE)
lower_ci <- avg_hou - sample(seq(0,0.05,0.001),96,replace=TRUE)
upper_ci <- avg_hou + sample(seq(0,0.05,0.001),96,replace=TRUE)
Month <- c(rep("December",24), rep("January",24), rep("June",24), rep("July",24))
testdata <- data.frame(Month,hour,avg_hou,lower_ci,upper_ci)
testdata$Month <- factor(testdata$Month,levels=c("June", "July", "December","January"))
# add offset column for dodge
testdata$dodge <- -2.5+(as.integer(testdata$Month))
# create ggplot object and default mappings
ggp <- ggplot(testdata, aes(x=hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci, color = Month, shape = Month))
ggp <- ggp + scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101", "December" = "#92C5DE", "January" = "#0571B0"))
# plot the point range
ggp + geom_pointrange(aes(x=hour+0.2*dodge), size=1)
This produces a plot whose legend follows the factor level order. It does not require geom_blank(...) to maintain the scale order, and it does not require two calls to geom_pointrange(...).
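If the summer and winter months need different spacing, as in the question, the same offset idea can be extended by giving each month its own offset instead of a single multiplier. The following is a minimal sketch, not part of the original answer: the `offsets` vector and its values are assumptions chosen for illustration, and the plotting step only runs if ggplot2 is installed.

```r
# Rebuild the sample data used in the answer above
set.seed(1)
hour <- rep(1:24, 4)
avg_hou <- sample(seq(0, 0.5, 0.001), 96, replace = TRUE)
lower_ci <- avg_hou - sample(seq(0, 0.05, 0.001), 96, replace = TRUE)
upper_ci <- avg_hou + sample(seq(0, 0.05, 0.001), 96, replace = TRUE)
Month <- factor(rep(c("December", "January", "June", "July"), each = 24),
                levels = c("June", "July", "December", "January"))
testdata <- data.frame(Month, hour, avg_hou, lower_ci, upper_ci)

# Assumed per-month x-offsets: narrow spacing for summer, wider for winter
offsets <- c(June = -0.075, July = 0.075, December = -0.15, January = 0.15)

# Look up by month name so the result does not depend on the level order
testdata$dodge <- unname(offsets[as.character(testdata$Month)])

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # A single layer on the full data set, shifted by the per-month offset
  ggp <- ggplot(testdata, aes(x = hour + dodge, y = avg_hou,
                              ymin = lower_ci, ymax = upper_ci,
                              color = Month, shape = Month)) +
    geom_pointrange(size = 1) +
    scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101",
                                  "December" = "#92C5DE", "January" = "#0571B0"))
  print(ggp)
}
```

Because there is still only one geom_pointrange layer using the full data set, the scale is trained once on the complete factor.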
One possibility is to add a geom_blank as the first layer in the plot. From ?geom_blank: "The blank geom draws nothing, but can be a useful way of ensuring common scales between different plots." We tell the geom_blank layer to use the entire data set. This layer thus sets up a scale that includes all levels of 'Month', correctly ordered. Then add the two geom_pointrange layers, each of which uses a subset of the data.
Perhaps a matter of taste in this particular case, but I tend to prefer to prepare the data sets before I use them in ggplot.
df_sum <- testdata[testdata$Month %in% c("June", "July"), ]
df_win <- testdata[testdata$Month %in% c("December", "January"), ]
ggplot(data = testdata, aes(x = hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci,
                            color = Month, shape = Month)) +
  geom_blank() +
  geom_pointrange(data = df_sum, size = 1, position = position_dodge(width = 0.3)) +
  geom_pointrange(data = df_win, size = 1, position = position_dodge(width = 0.6)) +
  scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101",
                                "December" = "#92C5DE", "January" = "#0571B0"))
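Another option that is sometimes used for this kind of problem, and which is not part of the answer above, is to pin the legend order on the scales themselves through their `limits` argument. The sketch below rebuilds `testdata`, `df_sum`, and `df_win` with placeholder values so that it is self-contained; `month_levels` is a name introduced only for this example. Whether this is needed may depend on the ggplot2 version, so treat it as something to try rather than a guaranteed fix.

```r
# Placeholder data with the same columns as the example above
set.seed(1)
month_levels <- c("June", "July", "December", "January")
testdata <- data.frame(
  Month = factor(rep(c("December", "January", "June", "July"), each = 24),
                 levels = month_levels),
  hour = rep(1:24, 4),
  avg_hou = runif(96, 0.05, 0.45)
)
testdata$lower_ci <- testdata$avg_hou - 0.03
testdata$upper_ci <- testdata$avg_hou + 0.03

df_sum <- testdata[testdata$Month %in% c("June", "July"), ]
df_win <- testdata[testdata$Month %in% c("December", "January"), ]

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(testdata, aes(x = hour, y = avg_hou, ymin = lower_ci, ymax = upper_ci,
                            color = Month, shape = Month)) +
    geom_pointrange(data = df_sum, size = 1, position = position_dodge(width = 0.3)) +
    geom_pointrange(data = df_win, size = 1, position = position_dodge(width = 0.6)) +
    # limits sets both which entries appear in the legend and their order
    scale_color_manual(values = c("June" = "#FDB863", "July" = "#E66101",
                                  "December" = "#92C5DE", "January" = "#0571B0"),
                       limits = month_levels) +
    scale_shape_discrete(limits = month_levels)
  print(p)
}
```

The same `limits` vector is given to both the color and the shape scale so that the two legends can still be merged into one.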