I plot several densities of values corresponding to categories of a discrete variable. I can associate to each density a specific color or a gradient of color for all of them at the same time. Now I would like to add a specific gradient for each density with varying values.
Here is a reproducible example using ggridges
:
data(iris)
library(ggplot2)
library(ggridges)
library(RColorBrewer)
cols <- brewer.pal(3, "BrBG")
# Plot with one color per group
ggplot(iris, aes(Sepal.Length, as.factor(Species))) +
geom_density_ridges(aes(fill = as.factor(Species))) +
scale_fill_manual("Sepal", values = cols)
# Plot with one gradient
ggplot(iris, aes(Sepal.Length, as.factor(Species))) +
geom_density_ridges_gradient(aes(fill = ..x..)) +
scale_fill_gradient2(low = "grey", high = cols[1], midpoint = 5)
I basically want to combine both plots. I am also interested in having a specific midpoint
value for each density.
I came up with the workaround below as a curiosity, but I don't think it's really good practice, as far as data visualisation goes. Having a single varying gradient in a density chart is shaky enough; having multiple different ones won't be any better. Please don't use it.
Preparation:
ggplot(iris, aes(Sepal.Length, as.factor(Species))) +
geom_density_ridges_gradient()
# plot normally & read off the joint bandwidth from the console message (0.181 in this case)
# split data based on the group variable, & define desired gradient colours / midpoints
# in the same sequential order.
split.data <- split(iris, iris$Species)
split.grad.low <- c("blue", "red", "yellow") # for illustration; please use prettier colours
split.grad.high <- cols
split.grad.midpt <- c(4.5, 6.5, 7) # for illustration; please use more sensible points
# create a separate plot for each group of data, specifying the joint bandwidth from the
# full chart.
split.plot <- lapply(seq_along(split.data),
function(i) ggplot(split.data[[i]], aes(Sepal.Length, Species)) +
geom_density_ridges_gradient(aes(fill = ..x..),
bandwidth = 0.181) +
scale_fill_gradient2(low = split.grad.low[i], high = split.grad.high[i],
midpoint = split.grad.midpt[i]))
Plot:
# Use layer_data() on each plot to get the calculated values for x / y / fill / etc,,
# & create two geom layers from each, one for the gradient fill & one for the ridgeline
# on top. Add them to a new ggplot() object in reversed order, because we want the last
# group to be at the bottom, overlaid by the others where applicable.
ggplot() +
lapply(rev(seq_along(split.data)),
function(i) layer_data(split.plot[[i]]) %>%
mutate(xmin = x, xmax = lead(x), ymin = ymin + i - 1, ymax = ymax + i - 1) %>%
select(xmin, xmax, ymin, ymax, height, fill) %>%
mutate(sequence = i) %>%
na.omit() %>%
{list(geom_rect(data = .,
aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax, fill = fill)),
geom_line(data = .,
aes(x = xmin, y = ymax)))}) +
# Label the y-axis labels based on the original data's group variable
scale_y_continuous(breaks = seq_along(split.data), labels = names(split.data)) +
# Use scale_fill_identity, since all the fill values have already been calculated.
scale_fill_identity() +
labs(x = "Sepal Length", y = "Species")
Note that this method won't create a fill legend. If desired, it's possible to retrieve the fill legends from the respective plots in split.plot
via get_legend
& add them to the plot above via plot_grid
(both functions from the cowplot
package), but that's like adding frills to an already weird visualisation choice...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With