 

Generating a consistent dynamic color palette in ggplot within a loop?

Tags:

r

ggplot2

I have a data set called d1 similar to:

location, depth.from, depth.to, val, type 

I have a loop that creates a fairly complex plot for each unique location (it's gluing together many things using grid.arrange, which is why I can't use facet_wrap on the location to keep the legend/color consistent for one part of the plot).

Say there are 4 categories for "type". The problem is that when one location has a different number of types than another, the colors assigned are not consistent between plots. I can manually force them to be the same, but I am trying to generalize this function. Google has failed me.

For the following block, d1 is a subset of the data based on the location, e.g.

d1 <- subset(myData, location == location.list[i])

Looking at the plot, which is within the loop:

p1 <- ggplot(data = d1, aes(y = val, x = depth.from)) +
  geom_point(size = 2) +
  geom_rect(data = d1, aes(xmin = depth.to, xmax = depth.from, ymin = 0, ymax = 100, fill = type), linetype = 0, alpha = 0.3) +
  scale_fill_brewer(palette = "Set1")

The geom_rect command goes through the data and, based on depth.from and depth.to, creates an overlay filled by type. I can use scale_fill_manual("Lith", c("Val1" = "DodgerBlue4"...) etc to set it manually, but that defeats the purpose. Say I have types where I want something like:

Bird_one = blue
Bird_two = red
Bird_three = green

I want bird_three to be green, even if bird_two doesn't exist, without having to explicitly set it using scale_fill_manual. Is there a way to set a global list of names for the color palette? Perhaps by providing the array from something like:

myData <- read.csv("mydata.csv")
typeList <- unique(myData$type)
mat4nier asked Jan 24 '14 01:01


2 Answers

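Pretty late, but there is actually a trivial solution: set drop = FALSE on the fill scale, e.g. scale_fill_discrete(drop = FALSE), so that unused factor levels keep their place in the palette.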

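library(ggplot2)
library(gridExtra)

# One plot per data frame; drop = FALSE keeps unused levels of type in the
# scale, so each level maps to the same colour in every plot
plots <- lapply(dfs, function(df) {
  ggplot(df,
    aes(
      x = location, fill = type,
      y = (depth.from + depth.to) / 2,
      ymin = depth.from, ymax = depth.to
  ) ) +
  geom_crossbar() + scale_fill_discrete(drop = FALSE)
})
do.call(grid.arrange, plots)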


And here is the dummy data I used:

set.seed(12)
items <- 2
dfs <- 
  replicate(2, simplify=FALSE,
    data.frame(
      location=sample(letters, items), 
      depth.from=runif(items, -10, -5),
      depth.to=runif(items, 5, 10),
      val=runif(items),
      type=factor(
        sample(c("Bird_one", "Bird_two", "Bird_three"), items),
        levels=c("Bird_one", "Bird_two", "Bird_three")
  ) ) )
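
The drop = FALSE approach relies on type being a factor that carries the same full set of levels in every subset. If type is read in as plain character strings, each subset only knows about the values it contains. The following is a minimal sketch of fixing the levels once on the complete data before looping; the contents of myData here are made up for illustration, and the geom_rect layer and "Set1" palette are borrowed from the question rather than from this answer.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's myData; the values are made up.
myData <- data.frame(
  location   = c("site_a", "site_a", "site_a", "site_b", "site_b"),
  depth.from = c(0, 10, 20, 0, 15),
  depth.to   = c(10, 20, 30, 15, 30),
  val        = c(12, 48, 77, 30, 65),
  type       = c("Bird_one", "Bird_two", "Bird_three", "Bird_one", "Bird_three"),
  stringsAsFactors = FALSE
)

# Fix the full set of levels once, on the complete data, before subsetting.
type_levels <- unique(myData$type)
myData$type <- factor(myData$type, levels = type_levels)

# Subsets made by split() (or subset()) keep all factor levels, used or not.
plots <- lapply(split(myData, myData$location), function(d1) {
  ggplot(d1, aes(x = depth.from, y = val)) +
    geom_rect(
      aes(xmin = depth.from, xmax = depth.to, ymin = 0, ymax = 100, fill = type),
      alpha = 0.3
    ) +
    geom_point(size = 2) +
    scale_fill_brewer(palette = "Set1", drop = FALSE)
})
```

Here site_b has no Bird_two rows, but Bird_three should still receive the third palette colour in both plots because the scale is trained on all three levels.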
BrodieG answered Sep 19 '22 08:09


Whether or not the plots are made in a loop is not important; you just need to associate each level with a colour. In your case:

colourList <- c(Bird_one = "blue", Bird_two = "red", Bird_three = "green")

In a simple example:

library(ggplot2)
library(gridExtra)

#Make some data
dat <- data.frame(location = rep(1:4, c(3,2,2,3)), val = rnorm(10), 
  depth.from = sample(1:5, 10, replace = TRUE), depth.to = sample(6:10, 10, replace = TRUE),
  type = factor(LETTERS[c(1:3, 1,3,1,3,1:3)]))

#Associate levels with colours
colourList <- c(A = "red", B = "blue", C = "green")

#One plot per location; location 2 has no rows of type B
p <- list()
for(i in 1:4) {
  d <- dat[dat$location == i,]
  p[[i]] <- ggplot(data = d, aes(y = val, x = depth.from)) +
    geom_point(size = 2) +
    geom_rect(aes(xmin = depth.to, xmax = depth.from, ymin = 0, ymax = 100, fill = type), linetype = 0, alpha = 0.3) +
    #This is where the assignment works: colours are matched to levels by name
    scale_fill_manual(values = colourList)
}
grid.arrange(p[[1]], p[[2]])

You can see that level C is green in both plots.

In response to @BrodieG, here is a way to set the colours semi-automatically. It creates the named vector from the levels of type and colour values from the RColorBrewer package. This could be developed pretty easily into a function:

library(RColorBrewer)
colourList <- setNames(brewer.pal(length(levels(dat$type)), "Set1"), levels(dat$type))
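
One way the one-liner above might be turned into a function is sketched below. The name make_colour_list and its arguments are hypothetical, not part of RColorBrewer or ggplot2. It also guards against two edge cases: brewer.pal() warns when asked for fewer than 3 colours, and each palette has a maximum size (9 for "Set1").

```r
library(RColorBrewer)

# make_colour_list() is a hypothetical helper name, not part of any package.
# x: a factor (or character vector) holding every level that can occur.
make_colour_list <- function(x, palette = "Set1") {
  lv <- if (is.factor(x)) levels(x) else sort(unique(as.character(x)))
  max_n <- brewer.pal.info[palette, "maxcolors"]
  if (length(lv) > max_n) {
    stop("palette '", palette, "' has only ", max_n, " colours")
  }
  # Request at least 3 colours to avoid the brewer.pal() warning,
  # then keep only as many as there are levels.
  cols <- brewer.pal(max(3, length(lv)), palette)[seq_along(lv)]
  setNames(cols, lv)
}

# Build the vector once from the full data, not from a per-location subset.
colourList <- make_colour_list(factor(c("A", "C", "A", "B")))
```

The result can then be passed to scale_fill_manual(values = colourList) in every plot, as in the loop above.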

As @hadley points out, in this instance it's even more straightforward to set the limits of the scale, although in my typical use I find it more useful to set up an object like colourList, which can be reused across multiple plots just by passing it as values. Setting limits also keeps all the levels in the legend, which may or may not be what is wanted:

scale_fill_brewer(limits = levels(dat$type), palette = "Set1")
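
To compare the two approaches side by side, here is a small sketch under made-up data in which the plotted subset lacks level B. The object names (fill_by_limits, fill_by_values) and the geom_tile layer are assumptions chosen only to keep the example short; ggplot_build() is used to read back the fill colours that would be drawn.

```r
library(ggplot2)
library(RColorBrewer)

# Made-up data: the second location lacks level "B".
dat <- data.frame(
  location = c(1, 1, 1, 2, 2),
  x        = c(1, 2, 3, 1, 2),
  type     = factor(c("A", "B", "C", "A", "C"))
)
d <- dat[dat$location == 2, ]

# Option 1: a reusable scale object with fixed limits.
fill_by_limits <- scale_fill_brewer(limits = levels(dat$type), palette = "Set1")

# Option 2: a reusable named colour vector passed as values.
colourList <- setNames(brewer.pal(nlevels(dat$type), "Set1"), levels(dat$type))
fill_by_values <- scale_fill_manual(values = colourList)

base <- ggplot(d, aes(x = x, y = 1, fill = type)) + geom_tile()
p_limits <- base + fill_by_limits
p_values <- base + fill_by_values

# Read back the fill colour assigned to each drawn tile.
fills_limits <- ggplot_build(p_limits)$data[[1]]$fill
fills_values <- ggplot_build(p_values)$data[[1]]$fill
```

Both versions should give A and C the same colours they would have in a plot containing all three levels; the practical difference is mainly in which levels the legend lists.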


alexwhan answered Sep 17 '22 08:09