
How to fill histogram with color gradient?

I have a simple problem: how do I plot a histogram with ggplot2 using a fixed binwidth and fill the bars with rainbow colors (or any other palette)?

Let's say I have data like this:

myData <- abs(rnorm(1000))

I want to plot a histogram with a fixed bin width, e.g. binwidth=.1. However, the number of bins then depends on the data:

ggplot() + geom_histogram(aes(x = myData), binwidth=.1) 


If I knew the number of bins (e.g. n=15), I'd use something like:

ggplot() + geom_histogram(aes(x = myData), binwidth=.1, fill=rainbow(n))

But because the number of bins changes with the data, I'm stuck on this simple problem.

Art asked Dec 18 '22 12:12


2 Answers

If you really want the number of bins to be flexible, here is my little workaround:

library(ggplot2)

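# build the plot once so ggplot2 computes the bins
gg_b <- ggplot_build(
  ggplot() + geom_histogram(aes(x = myData), binwidth=.1)
)

# one row per bin in the computed layer data
nu_bins <- nrow(gg_b$data[[1]])

# plot again, now with one color per bin
ggplot() + geom_histogram(aes(x = myData), binwidth=.1, fill = rainbow(nu_bins))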
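
The two-pass idea above can be wrapped in a small function so the bin count never has to be handled by hand. This is only a sketch: `rainbow_histogram()` and its `palette` argument are hypothetical names, not part of ggplot2, and the sample data is regenerated here so the snippet runs on its own.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): build the histogram once,
# read the number of bins from the computed layer data, then draw it
# again with one color per bin.
rainbow_histogram <- function(x, binwidth, palette = rainbow) {
  df <- data.frame(x = x)
  base <- ggplot(df, aes(x = x)) + geom_histogram(binwidth = binwidth)
  n_bins <- nrow(ggplot_build(base)$data[[1]])
  ggplot(df, aes(x = x)) +
    geom_histogram(binwidth = binwidth, fill = palette(n_bins))
}

set.seed(1L)
myData <- abs(rnorm(1000))
p <- rainbow_histogram(myData, binwidth = 0.1)
print(p)
```

Any function that takes a count and returns that many colors should work as `palette`, for example `heat.colors` or `terrain.colors` from base R.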

J_F answered Jan 03 '23 19:01


If the binwidth is fixed, an alternative is to use the internal function ggplot2:::bin_breaks_width() to get the number of bins before creating the graph. Because it is accessed with :::, it is not part of the public API and may change between ggplot2 versions. It is still a workaround, but it avoids calling geom_histogram() twice as in the other solution:

# create sample data
set.seed(1L)
myData <- abs(rnorm(1000))
binwidth <- 0.1

# create plot    
library(ggplot2)   # CRAN version 2.2.1 used
n_bins <- length(ggplot2:::bin_breaks_width(range(myData), width = binwidth)$breaks) - 1L
ggplot() + geom_histogram(aes(x = myData), binwidth = binwidth, fill = rainbow(n_bins)) 



As a third alternative, the aggregation can be done outside of ggplot2. Then, geom_col() can be used instead of geom_histogram():

# start binning on multiple of binwidth
start_bin <- binwidth * floor(min(myData) / binwidth)
# compute breaks and bin the data
breaks <- seq(start_bin, max(myData) + binwidth, by = binwidth)
myData2 <- cut(sort(myData), breaks = breaks)  # cut() has no `by` argument

ggplot() + geom_col(aes(x = head(breaks, -1L), 
                        y = as.integer(table(myData2)), 
                        fill = levels(myData2))) + 
  ylab("count") + xlab("myData")


Note that breaks is plotted on the x-axis instead of levels(myData2) to keep the x-axis continuous. Otherwise each factor label would be plotted, which would clutter the x-axis. Also note that the built-in ggplot2 color palette is used here instead of rainbow(), because fill is mapped inside aes().
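
If rainbow() colors are wanted with this pre-binned approach as well, one possible variation is to set fill outside aes(), with one color per bar. The following is a sketch under a few assumptions of my own: the data frame `bins` and its columns `mid` and `count` are made-up names, bars are centered on bin midpoints rather than on the left break, and include.lowest = TRUE is added so a value equal to the first break is not dropped.

```r
library(ggplot2)

set.seed(1L)
myData <- abs(rnorm(1000))
binwidth <- 0.1

# bin outside ggplot2, starting on a multiple of binwidth
start_bin <- binwidth * floor(min(myData) / binwidth)
breaks <- seq(start_bin, max(myData) + binwidth, by = binwidth)
binned <- cut(myData, breaks = breaks, include.lowest = TRUE)

# one row per bin: midpoint on the x-axis, count on the y-axis
bins <- data.frame(
  mid   = head(breaks, -1L) + binwidth / 2,
  count = as.integer(table(binned))
)

# fill is a fixed parameter here (outside aes()), one color per row
p <- ggplot(bins, aes(x = mid, y = count)) +
  geom_col(fill = rainbow(nrow(bins)), width = binwidth) +
  ylab("count") + xlab("myData")
print(p)
```

Because the number of bars equals nrow(bins), the palette length follows the data automatically and no second pass through ggplot2 is needed.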

Uwe answered Jan 03 '23 19:01