I want to create a plot (preferably using ggplot2) where I visualize a timeline together with a time-trend plot.
To put it in a practical example, I have aggregated unemployment rates for each year. I also have a data set denoting important legislation changes that are related to the labor market. Hence, I want to create a timeline where the unemployment rate is shown following the same x-axis (time).
I have generated some toy-data, see code below:
set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))
year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit",
"Pre-school became free",
"Five-day workweek were introduced",
"Labor law reform 1976",
"Unemployment benefit were cut in half",
"Apprenticeship Act allows on-the-job training",
"Changes in discrimination law",
"Equal Pay for Equal Work was",
"9 weeks vacation were introduced",
"Unemployment benefit were removed")
imp_event <- data.frame(year, events)
I can easily plot the time-trend across the years:
library(tidyverse)
ggplot(data = un_emp, aes(x = year, y = unemployment)) +
  geom_line(color = "#FC4E07", size = 0.5) +
  theme_bw()
But how do I include the events (found in imp_event) in the plot in a nice and efficient way?
My aim is to make a timeline looking like the one from here but to combine it with the time-trend plot shown above. How can I do this?
I have tried to use vline but I cannot add the label of the event.
Thanks!
I think this should do the trick:
First, I created the axis with geom_hline, using the mean you set for the data (0.05) as the y-intercept. Then I added a variable "height" to the events data frame, which takes the value of the axis and adds a value drawn from a normal distribution. I used this to draw the segments that run from the axis to each point. Finally, I inverted the y position of the year label so it is always on the opposite side of the axis from the segment.
library(tidyverse)
set.seed(2110)
year <- c(1950:2020)
unemployment <- rnorm(length(year), 0.05, 0.005)
un_emp <- data.frame(cbind(year, unemployment))
year <- c( 1957, 1961, 1975, 1976, 1983, 1985, 1995, 1999, 2011, 2018)
events <- c("Implemented unemployment benefit",
"Pre-school became free",
"Five-day workweek were introduced",
"Labor law reform 1976",
"Unemployment benefit were cut in half",
"Apprenticeship Act allows on-the-job training",
"Changes in discrimination law",
"Equal Pay for Equal Work was",
"9 weeks vacation were introduced",
"Unemployment benefit were removed")
imp_event <- data.frame(year, events) %>%
  # offset each event marker from the axis by a random amount
  mutate(height = mean(unemployment) + rnorm(n(), 0, 0.02))
ggplot(un_emp) +
  # horizontal timeline axis at the mean used to simulate the data
  geom_hline(yintercept = 0.05) +
  # on ggplot2 >= 3.4.0, use linewidth instead of size for lines
  geom_line(aes(x = year,
                y = unemployment),
            color = "red",
            alpha = 0.3,
            size = 1) +
  # vertical stem from the axis to each event marker
  geom_segment(data = imp_event,
               aes(x = year,
                   xend = year,
                   y = 0.05,
                   yend = height)) +
  # year label, placed on the opposite side of the axis from the stem
  geom_text(data = imp_event,
            aes(label = year,
                x = year,
                y = 0.05 + 0.002 * sign(0.05 - height)),
            angle = 90,
            size = 3.5,
            fontface = "bold",
            check_overlap = TRUE) +
  # event marker, filled by event so the legend names each one
  geom_point(data = imp_event,
             aes(x = year,
                 y = height,
                 fill = as.factor(events)),
             shape = 21,
             size = 4) +
  scale_x_continuous(name = NULL,
                     labels = NULL) +
  scale_fill_discrete(name = "Event") +
  scale_y_continuous(name = "Unemployment Rate") +
  theme_bw() +
  theme(panel.border = element_blank(),
        axis.line.y = element_line(),
        axis.ticks.x = element_blank(),
        panel.grid = element_blank(),
        legend.position = "bottom")
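The label placement is the least obvious step in the code above, so here is a minimal sketch of it in isolation. The two heights are made-up values (not taken from the simulated data), chosen only so that one marker sits above the axis and one below; `axis_y` and `label_y` are names introduced for this illustration.

```r
# Hypothetical example values: one marker above the axis, one below it
axis_y <- 0.05
height <- c(0.07, 0.03)

# sign() is -1 when the marker is above the axis and +1 when it is below,
# so the label is nudged to the side of the axis opposite the stem
label_y <- axis_y + 0.002 * sign(axis_y - height)

print(label_y)
```

The first label ends up just below the axis and the second just above it, which is why the year never overlaps its own segment.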
I worked with Jon Spring's solution but replaced geom_segment with geom_vline, which gave a result close to what I wanted. The final code looked like this:
joined_data <- un_emp %>% left_join(imp_event, by = "year")
ggplot(data = joined_data, aes(x = year, y = unemployment)) +
  geom_line(color = "red", size = 0.5) +
  theme_classic() +
  labs(y = "Unemployment rate",
       x = "Years",
       caption = "Data from XXXX") +
  geom_vline(data = joined_data %>% filter(!is.na(events)),
             aes(xintercept = year),
             color = "gray70", linetype = "dashed") +
  ggrepel::geom_text_repel(data = joined_data,
                           aes(x = year, y = unemployment - 0.03,
                               label = str_wrap(events, 10)),
                           color = "gray70", direction = "y", size = 2.5,
                           lineheight = 0.7, point.padding = 0.8)
Which produces the following plot:
I want to award the bounty to @Jon Spring, but I am not sure how to reward a comment.
You can achieve this by overlaying a geom_text() call. Because that layer inherits the plot's x and y aesthetics, its data needs both a year and an unemployment column, so you can't just feed it imp_event as it is and overlay that.
Instead, you can achieve what you want by doing a left_join from un_emp to imp_event on year. Because there is only one row per event year in imp_event, most rows of the joined data frame will have a missing value for events, which is perfect, as I suspect you only want each event to appear as a label once.
For example:
joined_data <- un_emp %>% left_join(imp_event, by = "year")
ggplot(data = joined_data, aes(x = year, y = unemployment)) +
  geom_line(color = "#FC4E07", size = 0.5) +
  # size is a constant here, so it belongs outside aes()
  geom_text(data = joined_data, aes(x = year, y = unemployment, label = events), size = 3) +
  theme_bw()
Which gives you something like this:
You can have a look at the available options and play around with geom_text() here.
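To see what the join actually does to the data, here is a rough check on an abbreviated version of the toy data. It assumes dplyr is installed and uses only the first three events to keep it short; the data frames are rebuilt here so the snippet runs on its own.

```r
library(dplyr)

set.seed(2110)
# Same shape as the toy data in the question: one row per year
un_emp <- data.frame(year = as.numeric(1950:2020),
                     unemployment = rnorm(71, 0.05, 0.005))

# Abbreviated events table (first three events only, for illustration)
imp_event <- data.frame(year = c(1957, 1961, 1975),
                        events = c("Implemented unemployment benefit",
                                   "Pre-school became free",
                                   "Five-day workweek were introduced"))

# left_join keeps every row of un_emp and fills events with NA
# for years that have no matching event
joined_data <- un_emp %>% left_join(imp_event, by = "year")

n_rows   <- nrow(joined_data)               # rows kept from un_emp
n_labels <- sum(!is.na(joined_data$events)) # rows that carry an event label
print(c(rows = n_rows, labels = n_labels))
```

Every year keeps its unemployment value, and only the event years carry a non-missing label, which is what lets a single geom_text() layer print each event exactly once.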