I'm making several histograms using ggplot2 and facet_wrap and would like to plot the mean value on each panel. Below, I create a dummy data frame, find the mean of each facet, and then create the plots, adding the mean using geom_point.
# Load libraries
library(tidyverse)
# Toy data frame
df <- data.frame(ID = sample(letters[1:3], 100, replace = TRUE), n = runif(100))
# Mean value of each group
df_mean <- df %>% group_by(ID) %>% summarise(mean = mean(n))
# Plot histograms
ggplot(df) +
geom_histogram(aes(n)) +
facet_wrap(~ID) +
geom_point(data = df_mean, aes(x = mean, y = Inf))
I used y = Inf to place the point at the top of each facet, but, as you can see, it is cropped somewhat. I'd like to nudge it downwards so that it is completely visible. To my knowledge, geom_point doesn't have a nudge_y or vadj argument, and 0.7 * Inf is obviously nonsensical. I also tried adding position = position_nudge(y = -5) as an argument to geom_point, but this doesn't appear to have any effect. As a workaround, I even tried using geom_text and specifying nudge_y, but, like the position_nudge solution, it did not have any noticeable effect. Is there an easy way of doing this whilst plotting, or do I simply need to calculate the y value prior to plotting?
If you are OK with using geom_text() or geom_label(), you can use the vjust argument to do this:
ggplot(df) +
geom_histogram(aes(n)) +
facet_wrap(~ID) +
geom_text(data = df_mean, aes(x = mean, y = Inf),
label = "Mean", vjust = 1)
I use it all the time for floating percent changes or p-values at the top of a panel. You don't have to calculate anything; ggplot has got you.
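A minimal sketch of why nudging has no effect here while vjust does: a nudge adds a finite offset to y, and Inf plus or minus any finite number is still Inf, whereas vjust moves the drawn text relative to its anchor. The seed, the bins value, and the use of a Unicode bullet as a stand-in for a point are assumptions added for illustration, and how the bullet renders depends on the graphics device and font.

```r
library(ggplot2)

# A nudge is added to y, but Inf minus a finite number is still Inf
Inf - 5
is.infinite(Inf - 5)

set.seed(1)  # assumed seed, only to make the sketch reproducible
df <- data.frame(ID = sample(letters[1:3], 100, replace = TRUE), n = runif(100))

# Mean of n per ID, using base R so only ggplot2 is required
df_mean <- aggregate(n ~ ID, data = df, FUN = mean)
names(df_mean)[2] <- "mean"

# vjust = 1 puts the top of the text at the anchor (the panel edge);
# values above 1 push the text further down into the panel
p_text <- ggplot(df) +
  geom_histogram(aes(n), bins = 30) +
  facet_wrap(~ID) +
  geom_text(data = df_mean, aes(x = mean, y = Inf),
            label = "\u25CF", vjust = 1.5)

p_text
```

Because vjust is expressed relative to the height of the text itself, the offset stays the same no matter how tall the bars in each panel are.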
# Load libraries
library(tidyverse)
# Toy data frame
df <- data.frame(ID = sample(letters[1:3], 100, replace = TRUE), n = runif(100))
# Mean value of each group
df_mean <- df %>% group_by(ID) %>% summarise(mean = mean(n))
# Build the base histograms and store the plot so it can be inspected
p <- ggplot(df) +
geom_histogram(aes(n)) +
facet_wrap(~ID)
# Get the max count from the data ggplot_build() computes for the histogram layer,
# then plot each mean at that height
p + geom_point(data = df_mean, aes(x = mean, y = max(ggplot_build(p)$data[[1]]$count)))
The key here is to know the maximum count value, because that is the height of the tallest bar and so sits near the top of the y axis in every panel (facet_wrap uses the same y scale for all panels by default). You can get that value with the ggplot_build function and use it to plot your points in the right place.
Of course, you can go a bit higher than the max count in case the point falls on one of the bars, like this: y = 0.2 + max(ggplot_build(p)$data[[1]]$count)
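If the panels do not share a y axis (for example with scales = "free_y"), a single global maximum no longer marks the top of every panel. The following is a rough sketch of a per-panel variant, not part of the answer above. It assumes that the built plot exposes a layout table with PANEL and ID columns via ggplot_build(p)$layout$layout, which is an internal detail that may differ between ggplot2 versions; the seed, the bins value, and the names built, bars, panel_max, and panel_ids are hypothetical.

```r
library(ggplot2)

set.seed(1)  # assumed seed, only to make the sketch reproducible
df <- data.frame(ID = sample(letters[1:3], 100, replace = TRUE), n = runif(100))

# Mean of n per ID
df_mean <- aggregate(n ~ ID, data = df, FUN = mean)
names(df_mean)[2] <- "mean"

# Base plot with an independent y axis per panel
p <- ggplot(df) +
  geom_histogram(aes(n), bins = 30) +
  facet_wrap(~ID, scales = "free_y")

built <- ggplot_build(p)
bars <- built$data[[1]]  # computed bins for the histogram layer

# Tallest bar within each panel
panel_max <- aggregate(count ~ PANEL, data = bars, FUN = max)

# Map panel numbers back to the faceting variable
# (assumes the layout table has PANEL and ID columns)
panel_ids <- built$layout$layout[, c("PANEL", "ID")]
panel_max <- merge(panel_max, panel_ids, by = "PANEL")

# Attach each panel's maximum count to its mean
df_mean <- merge(df_mean, panel_max[, c("ID", "count")], by = "ID")

p + geom_point(data = df_mean, aes(x = mean, y = count))
```

With the default fixed scales this gives each panel its own height rather than the shared maximum, so the simpler one-line version is usually enough unless the y axes are free.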