I have this chart that I'm attempting to replicate. It has two continuous variables for the X & Y axes, and charts the relationship between these two variables through time with a line.
My question has two parts:
First, what is this type of chart called? It is unusual because the line between the points is determined by a third variable (the year) rather than their positions on the X axis.
Second, does anyone know whether this can be achieved with ggplot? I have so far created a chart similar to the above but without the line connecting the points. This code
ggplot(data, aes(x = Weekly_Hours_Per_Person, y = GDP_Per_Hour)) + geom_point()
produces the output below:
But how do I get the line to follow the years?
Any help on either point will be appreciated. Thanks!
ggplot2 is a plotting package that provides helpful commands to create complex plots from data in a data frame. It provides a more programmatic interface for specifying what variables to plot, how they are displayed, and general visual properties.
Scatter plots
A scatter plot is one of the simplest representations of a bivariate distribution. Scatter plots are simple to create in ggplot2 by specifying the appropriate X and Y variables in the aesthetic mapping and using geom_point for the geometric mapping.
ggplot() initializes a ggplot object. It can be used to declare the input data frame for a graphic and to specify the set of plot aesthetics intended to be common throughout all subsequent layers unless specifically overridden.
At present, ggplot2 cannot be used to create 3D graphs or mosaic plots. Use I(value) to indicate a specific value. For example, size = z makes the size of the plotted points or lines proportional to the values of a variable z. In contrast, size = I(3) sets each point or line to a fixed size of 3 instead of mapping size to a variable.
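In the layered ggplot() interface, the same distinction is usually expressed by where the argument is written: inside aes() it is mapped to a variable, outside aes() it is a constant. The following is a minimal sketch using the built-in mtcars data; the choice of hp as the size variable is only for illustration.

```r
library(ggplot2)

# Mapped size: point size varies with the hp column.
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(size = hp))

# Fixed size: every point is drawn at size 3.
p_fixed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 3)

p_mapped
p_fixed
```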
According to the ggplot2 concept, a plot can be divided into different fundamental parts: Plot = data + Aesthetics + Geometry
data: a data frame
aesthetics: used to indicate the x and y variables
geometry: corresponds to the type of graphics (histogram, box plot, line plot, ...)
Here are some examples using automotive data (car mileage, weight, number of gears, number of cylinders, etc.) contained in the mtcars data frame. Unlike base R graphs, the ggplot2 graphs are not affected by many of the options set in the par() function.
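As a small illustration of the data + aesthetics + geometry idea with mtcars, a scatter plot of fuel economy against weight could look roughly like this. The choice of the wt and mpg columns is an assumption made for the example, not something taken from the original text.

```r
library(ggplot2)

# data:       the mtcars data frame
# aesthetics: weight (wt) on x, miles per gallon (mpg) on y
# geometry:   points, i.e. a scatter plot
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()

p
```

Swapping geom_point() for another geom changes the type of graphic while the data and aesthetic mapping stay the same.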
Graphics with ggplot2. The ggplot2 package, created by Hadley Wickham, offers a powerful graphics language for creating elegant and complex plots. Its popularity in the R community has exploded in recent years.
Use geom_path, i.e.
library(ggplot2)
ggplot(data, aes(x = Weekly_Hours_Per_Person, y = GDP_Per_Hour)) +
  geom_point() +
  geom_path()
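Because geom_path() follows the row order of the data, it may help to sort by year first. Here is a minimal sketch, assuming the data frame has a year column; the toy values below are made up for illustration and are not the asker's data.

```r
library(ggplot2)

# Hypothetical toy data, deliberately stored out of chronological order.
data <- data.frame(
  year = c(1990, 1970, 1980, 1960, 2000),
  Weekly_Hours_Per_Person = c(21, 24, 22, 25, 22),
  GDP_Per_Hour = c(30, 18, 25, 12, 38)
)

# Sort rows by year so the path runs through time.
data_by_year <- data[order(data$year), ]

p <- ggplot(data_by_year, aes(x = Weekly_Hours_Per_Person, y = GDP_Per_Hour)) +
  geom_path() +
  geom_point()

p
```

If the rows are already in chronological order, the sorting step is harmless and can be dropped.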
Just expanding upon the original question. This is a path graph, and as explained here:
"geom_path() connects the observations in the order in which they appear in the data. geom_line() connects them in order of the variable on the x axis."
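To see the difference described in the quote, the two geoms can be drawn from the same data. This is only a sketch with a made-up data frame (toy, step, x and y are hypothetical names) in which x rises and then falls from one row to the next.

```r
library(ggplot2)

# Hypothetical data: rows are in time order, but x is not monotone.
toy <- data.frame(
  step = 1:5,
  x = c(1, 3, 5, 4, 2),
  y = c(1, 2, 3, 4, 5)
)

# Connects rows 1 -> 2 -> 3 -> 4 -> 5, so the line doubles back on itself.
p_path <- ggplot(toy, aes(x, y)) +
  geom_path() +
  ggtitle("geom_path(): row order")

# Connects points from smallest x to largest x, ignoring row order.
p_line <- ggplot(toy, aes(x, y)) +
  geom_line() +
  ggtitle("geom_line(): ordered by x")

p_path
p_line
```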
As an extension of your original question, you can label selected points where the line bends. Here is an example with reproducible data:
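library(dplyr)  # provides filter() for data frames

set.seed(123)
# Simulated data: hours move back and forth while GDP per hour trends upward
df <- data.frame(year = 1960:2006,
                 Weekly_Hours_Per_Person = c(2:10, 9:0, 1:10, 9:1, 2:10),
                 GDP_Per_Hour = 1:47 + rnorm(n = 47, mean = 0))
# Only label selected years
df_label <- filter(df, year %in% c(1960, 1968, 1978, 1988, 1997, 2006))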
Then use the ggrepel package to offset the labels from the vertices.
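library(ggplot2)
library(ggrepel)
ggplot(df, aes(Weekly_Hours_Per_Person, GDP_Per_Hour)) +
  geom_path() +                                          # connect rows in data (year) order
  geom_point(data = df_label) +                          # mark only the labelled years
  geom_text_repel(data = df_label, aes(label = year)) +  # labels nudged away from the points
  scale_x_continuous(limits = c(-2, 12))                 # extra room for the labels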