I was working with panel data for states and localities and ran into a strange issue when plotting the time series. I wanted to plot each state's series in light grey, highlight key states in specific colors, and add a matching colored label at the end of the line for each highlighted state. I also wanted to include a line for the mean across states. For some reason, the scale of the variable being plotted throws the labeling off.
I generated some clunky data below that demonstrates the problem. The label for the average goes haywire with some variables. Any help would be appreciated. I'm mainly curious why the code works fine with one variable and not the other: apart from the variable being plotted and its axis label, the two sets of code are identical.
library(tidyverse)
#Creating state labels
state<-c(rep("A",21), rep("B",21), rep("C",21), rep("D",21))
#Creating years for each state
year<-rep(2000:2020, 4)
#Generating each state's population
population_a<-5000:5020
population_b<-population_a+10
population_c<-population_a+20
population_d<-population_a+30
population<-c(population_a, population_b, population_c, population_d)
#Consolidating the data
mydata<-data.frame(state, year, population)
mydata$lnpop<-log(mydata$population)
#PLOTTING TIME-SERIES FOR EACH STATE
#THIS WORKS:
ggplot(data=mydata, aes(year, lnpop)) +
geom_line(aes(group=state), colour="gray")+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="A"),
aes(x = year+0.3, label=state), colour="purple", hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="B"),
aes(x = year+0.3, label=state), colour="red",hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="D"),
aes(x = year+0.3, label=state), colour="blue",hjust=0)+
guides(colour="none") +
expand_limits(x = max(mydata$year) + 0.3)+
geom_line(data=subset(mydata, state == "A"), colour="purple")+
geom_line(data=subset(mydata, state == "B"), colour="red")+
geom_line(data=subset(mydata, state == "D"), colour="blue")+
stat_summary(fun = mean, geom = "line") +
stat_summary(data=subset(mydata, year==max(year)), fun = mean, geom = "text", show.legend = FALSE, hjust=0, aes(x=year+0.05,label="AVG")) +
xlab("Year")+
ylab("Population (Logged)")
#BUT THIS DOES NOT:
ggplot(data=mydata, aes(year, population)) +
geom_line(aes(group=state), colour="gray")+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="A"),
aes(x = year+0.3, label=state), colour="purple", hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="B"),
aes(x = year+0.3, label=state), colour="red",hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="D"),
aes(x = year+0.3, label=state), colour="blue",hjust=0)+
guides(colour="none") +
expand_limits(x = max(mydata$year) + 0.3)+
geom_line(data=subset(mydata, state == "A"), colour="purple")+
geom_line(data=subset(mydata, state == "B"), colour="red")+
geom_line(data=subset(mydata, state == "D"), colour="blue")+
stat_summary(fun = mean, geom = "line") +
stat_summary(data=subset(mydata, year==max(year)), fun = mean, geom = "text", show.legend = FALSE, hjust=0, aes(x=year+0.05,label="AVG")) +
xlab("Year")+
ylab("Population")

--

EDIT: Spaced out the lines in the plots a bit.
Another workaround using annotate()
library(ggplot2)
library(dplyr)
state<-c(rep("A",21), rep("B",21), rep("C",21), rep("D",21))
#Creating years for each state
year<-rep(2000:2020, 4)
#Generating each state's population
population_a<-5000:5020
population_b<-population_a+2
population_c<-population_a+3
population_d<-population_a+5
population<-c(population_a, population_b, population_c, population_d)
#Consolidating the data
mydata<-data.frame(state, year, population)
sub_dat <- subset(mydata, year==max(year))
ggplot(data=mydata, aes(year, population)) +
geom_line(aes(group=state), colour="gray")+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="A"),
aes(x = year+0.3, label=state), colour="purple", hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="B"),
aes(x = year+0.3, label=state), colour="red",hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="D"),
aes(x = year+0.3, label=state), colour="blue",hjust=0)+
guides(colour="none") +
expand_limits(x = max(mydata$year) + 0.3)+
geom_line(data=subset(mydata, state == "A"), colour="purple")+
geom_line(data=subset(mydata, state == "B"), colour="red")+
geom_line(data=subset(mydata, state == "D"), colour="blue")+
stat_summary(fun = mean, geom = "line") +
annotate("text",
x = max(sub_dat$year) + 0.05, y = mean(sub_dat$population),
label = "AVG", hjust = 0) +
xlab("Year")+
ylab("Population")

Created on 2020-04-16 by the reprex package (v0.3.0)
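A variation on the same idea, sketched here under the assumption that you would rather keep the label as an ordinary text layer: compute the label position once in a one-row data frame and pass it to geom_text(). The name avg_label is made up for this illustration, and the highlighted-state layers are left out to keep the sketch short.

```r
library(ggplot2)

# Example data matching the answer above
mydata <- data.frame(
  state = rep(c("A", "B", "C", "D"), each = 21),
  year = rep(2000:2020, 4),
  population = c(5000:5020, 5000:5020 + 2, 5000:5020 + 3, 5000:5020 + 5)
)

# One-row data frame holding the label position (avg_label is a hypothetical name)
last_year <- subset(mydata, year == max(year))
avg_label <- data.frame(
  year = max(last_year$year) + 0.05,        # nudge the label right of the line end
  population = mean(last_year$population),  # mean across states in the final year
  label = "AVG"
)

p <- ggplot(mydata, aes(year, population)) +
  geom_line(aes(group = state), colour = "gray") +
  stat_summary(fun = mean, geom = "line") +
  geom_text(data = avg_label, aes(label = label), hjust = 0)

print(p)
```

Because the mean is computed outside ggplot2, no summary statistic (and therefore no orientation guess) is involved in placing the label.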
Alternatively, set the argument orientation = "x" in stat_summary() explicitly. The ggplot2 documentation describes the parameter as follows:
This geom treats each axis differently and, thus, can have two orientations. Often the orientation is easy to deduce from a combination of the given mappings and the types of positional scales in use. Thus, ggplot2 will by default try to guess which orientation the layer should have. Under rare circumstances, the orientation is ambiguous and guessing may fail. In that case the orientation can be specified directly using the orientation parameter, which can be either "x" or "y". The value gives the axis that the geom should run along, "x" being the default orientation you would expect for the geom.
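To see why the guess might come out differently for the two variables, it can help to look at the values the "AVG" text layer actually receives. The sketch below is only a diagnostic and rests on an assumption the quoted documentation does not spell out: that the guess is sensitive to whether a positional variable contains nothing but whole numbers. The helper looks_whole() is hypothetical and not part of ggplot2.

```r
# Rebuild the relevant columns from the question's example data
year <- rep(2000:2020, 4)
population <- c(5000:5020, 5000:5020 + 10, 5000:5020 + 20, 5000:5020 + 30)
lnpop <- log(population)

# Hypothetical helper (not a ggplot2 function):
# TRUE when every value is, up to numerical tolerance, a whole number
looks_whole <- function(v) isTRUE(all.equal(v, round(v)))

# Values seen by the "AVG" layer: final year only, with x shifted by 0.05
last <- year == max(year)
x_label <- year[last] + 0.05

looks_whole(x_label)           # FALSE: 2020.05
looks_whole(lnpop[last])       # FALSE: logged values are fractional
looks_whole(population[last])  # TRUE: 5020, 5030, 5040, 5050
```

Only in the population plot does exactly one of the two positions look whole-numbered, which is the kind of ambiguous situation in which a guess could settle on the wrong axis. Passing orientation = "x" removes the guess altogether.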
ggplot(data=mydata, aes(year, population)) +
geom_line(aes(group=state), colour="gray")+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="A"),
aes(x = year+0.3, label=state), colour="purple", hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="B"),
aes(x = year+0.3, label=state), colour="red",hjust=0)+
geom_text(data=mydata %>% group_by(state) %>%
arrange(desc(year)) %>%
slice(1) %>%
filter(state=="D"),
aes(x = year+0.3, label=state), colour="blue",hjust=0)+
guides(colour="none") +
expand_limits(x = max(mydata$year) + 0.3)+
geom_line(data=subset(mydata, state == "A"), colour="purple")+
geom_line(data=subset(mydata, state == "B"), colour="red")+
geom_line(data=subset(mydata, state == "D"), colour="blue")+
stat_summary(fun = mean, geom = "line") +
stat_summary(data=subset(mydata, year==max(year)), fun = mean, geom = "text", show.legend = FALSE, hjust=0, aes(x=year+0.05,label="AVG"), orientation = "x") +
xlab("Year")+
ylab("Population")