The geom_hex geometry in ggplot2 colors hexagonal bins according to the number of points falling within them. This works well for uniformly distributed data, but poorly when some regions are much denser than others: differences among the sparser bins get drowned out by a single very dense hexagon.
How can I make the density color scale use a log scale or some other kind of normalizing transformation?
ggplot2 3.0+ exposes computed summary statistics through the stat() helper function, which makes it easier to modify the statistic used to create the fill for the hexes. (From ggplot2 3.3.0 onward, stat() is deprecated in favor of after_stat(), which is used the same way.) For example:
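library(ggplot2)  # geom_hex also needs the hexbin package installed

df <- data.frame(
  x = rnorm(1000),
  y = rnorm(1000)
)

# Default: fill by the raw count in each hexagon
plot.df <- ggplot(data = df, aes(x = x, y = y)) +
  geom_hex(aes(fill = stat(count)))
print(plot.df)

# Fill by the natural log of the count instead
plot.df.log <- ggplot(data = df, aes(x = x, y = y)) +
  geom_hex(aes(fill = stat(log(count))))
print(plot.df.log)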
In place of log, you could apply any transformation you want, such as a cube root.
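As a minimal sketch of the cube-root variant mentioned above: the seed, the plot.df.cbrt name, and the legend title are illustrative choices, and after_stat() assumes ggplot2 3.3.0 or later (on older versions, stat() would take its place).

```r
library(ggplot2)  # geom_hex also needs the hexbin package installed

set.seed(42)  # arbitrary seed, only so the sketch is reproducible
df <- data.frame(x = rnorm(1000), y = rnorm(1000))

# Fill by the cube root of the per-hexagon count
plot.df.cbrt <- ggplot(data = df, aes(x = x, y = y)) +
  geom_hex(aes(fill = after_stat(count^(1/3)))) +
  labs(fill = "count^(1/3)")
print(plot.df.cbrt)
```

The cube root compresses large counts less aggressively than log, which may suit data whose density contrast is moderate.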
cut
To avoid a legend with hard-to-read transformed values, you could use cut to establish sensible category boundaries and convert these to a numeric scale that is labeled with the original count values:
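# On the count scale the bins are (0,1], (1,2], (2,4], (4,Inf];
# labels = FALSE makes cut() return the bin index 1..4
plot.df.log.cut <- ggplot(data = df, aes(x = x, y = y)) +
  geom_hex(aes(fill = stat(cut(log(count), breaks = log(c(0, 1, 2, 4, Inf)),
                               labels = FALSE, right = TRUE, include.lowest = TRUE)))) +
  # Explicit breaks keep the four labels aligned with the four bin indices
  scale_fill_continuous(name = 'count', breaks = 1:4,
                        labels = c('1', '2', '3-4', '5+'))
print(plot.df.log.cut)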
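To check which counts end up in which category, the same cut call can be run on its own in base R. This is only a sketch: the counts vector below is made up for illustration and does not come from the plotted data.

```r
# Hypothetical per-hexagon counts, chosen to hit every bin
counts <- c(1, 2, 3, 4, 5, 8, 20)

# Same call as in the plot: returns the index of the bin each count falls into
bin <- cut(log(counts), breaks = log(c(0, 1, 2, 4, Inf)),
           labels = FALSE, right = TRUE, include.lowest = TRUE)

print(data.frame(count = counts, bin = bin))
```

With these breaks, counts of 3 and 4 share the third bin and every count above 4 lands in the fourth, so the legend labels should describe those ranges.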