
Add color to boxplot - "Continuous value supplied to discrete scale" error

There is probably a very easy solution to my problem but I couldn't find a satisfying answer online.

Using the following command, I was able to create a boxplot and overlay it with the individual data points:

ggplot(data = MYdata, aes(x = Age, y = Richness)) +
  geom_boxplot(aes(group = Age)) +
  geom_point(aes(color = Age))

There are several things I would like to add/change:

1. Change the line color and/or fill of each boxplot (depending on "Age") using 6 different colors from left to right:

c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00") 

I tried

ggplot(data = MYdata, aes(Age, Richness)) +
  geom_boxplot(aes(group = Age)) +
  scale_colour_manual(values = c("#E69F00", "#56B4E9", "#009E73",
                                 "#F0E442", "#0072B2", "#D55E00"))

but it results in a "Continuous value supplied to discrete scale" error.

2. Change the color of each data point (depending on "Age") using 6 different colors from left to right:

c("#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00") 

I tried:

ggplot(data = MYdata, aes(Age, Richness)) +
  geom_boxplot(aes(group = Age)) +
  geom_point(aes(color = Age)) +
  scale_colour_manual(values = c("#E69F00", "#56B4E9", "#009E73",
                                 "#F0E442", "#0072B2", "#D55E00"))

but it also results in an error:

Continuous value supplied to discrete scale

3. Change the text in the legend to "0 month", "1 month", "3 months", "6 months", "9 months", "12 months"

Dalmuti71 asked May 29 '12 19:05





1 Answer

First, providing sample data would help. Since you didn't, here is some:

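library(ggplot2)  # needed for the plotting calls below

# Six age groups (in months) with 20 simulated richness values each
MYdata <- data.frame(Age = rep(c(0, 1, 3, 6, 9, 12), each = 20),
                     Richness = rnorm(120, 10000, 2500))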
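Before plotting, it can help to check how R stores Age in this sample data. The following is a minimal sketch using only base R. The `set.seed()` call and the `age_f` variable are additions for illustration and are not part of the original question.

```r
# Same shape as the sample data above; set.seed() is added only for reproducibility.
set.seed(1)
MYdata <- data.frame(Age = rep(c(0, 1, 3, 6, 9, 12), each = 20),
                     Richness = rnorm(120, 10000, 2500))

class(MYdata$Age)   # "numeric": ggplot2 will choose a continuous scale for it

# age_f is a hypothetical helper used only to inspect the conversion.
age_f <- factor(MYdata$Age)
class(age_f)        # "factor": ggplot2 will choose a discrete scale for it
levels(age_f)       # "0" "1" "3" "6" "9" "12"
```

The factor levels are character strings, which is why breaks for a manual scale on `factor(Age)` are written as `"0"`, `"1"`, and so on rather than as numbers.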

Parts 1 and 2 stem from the same problem. Age is a continuous variable, but you are trying to use it in a discrete scale (by specifying the color for specific values of age). In general, a scale maps the variable to the visual; for a continuous age, there is a corresponding color for every possible value of age, not just the ones that happen to appear in your data. However, you can simultaneously treat age as a categorical variable (factor) for some of the aesthetics.

For the third part of your question, within the scale description, you can define specific labels corresponding to specific breaks in the scale.

Putting this all together (and adding scale_x_continuous() breaks so the x axis is labelled more like the one in your example):

ggplot(data = MYdata, aes(x = Age, y = Richness)) +
  geom_boxplot(aes(fill = factor(Age))) +
  geom_point(aes(color = factor(Age))) +
  scale_x_continuous(breaks = c(0, 1, 3, 6, 9, 12)) +
  scale_colour_manual(breaks = c("0", "1", "3", "6", "9", "12"),
                      labels = c("0 month", "1 month", "3 months",
                                 "6 months", "9 months", "12 months"),
                      values = c("#E69F00", "#56B4E9", "#009E73",
                                 "#F0E442", "#0072B2", "#D55E00")) +
  scale_fill_manual(breaks = c("0", "1", "3", "6", "9", "12"),
                    labels = c("0 month", "1 month", "3 months",
                               "6 months", "9 months", "12 months"),
                    values = c("#E69F00", "#56B4E9", "#009E73",
                               "#F0E442", "#0072B2", "#D55E00"))


With this color scheme, the points that fall inside the boxplot are not visible (since they are the same color as the boxplot's fill). Perhaps leaving the boxplot hollow and drawing its lines in the color would be better.

ggplot(data = MYdata, aes(x = Age, y = Richness)) +
  geom_boxplot(aes(colour = factor(Age)), fill = NA) +
  geom_point(aes(color = factor(Age))) +
  scale_x_continuous(breaks = c(0, 1, 3, 6, 9, 12)) +
  scale_colour_manual(breaks = c("0", "1", "3", "6", "9", "12"),
                      labels = c("0 month", "1 month", "3 months",
                                 "6 months", "9 months", "12 months"),
                      values = c("#E69F00", "#56B4E9", "#009E73",
                                 "#F0E442", "#0072B2", "#D55E00"))

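As a possible variation on the hollow-boxplot version, the labels can be attached to the data once instead of being repeated inside the scale. This is only a sketch under stated assumptions: `AgeLabel`, `age_breaks`, `age_labels`, and `age_colours` are hypothetical names introduced here, the legend title "Age" is an arbitrary choice, and `set.seed()` is added purely for reproducibility.

```r
library(ggplot2)

set.seed(1)  # added only so the random sample is reproducible
MYdata <- data.frame(Age = rep(c(0, 1, 3, 6, 9, 12), each = 20),
                     Richness = rnorm(120, 10000, 2500))

age_breaks <- c(0, 1, 3, 6, 9, 12)
age_labels <- c("0 month", "1 month", "3 months",
                "6 months", "9 months", "12 months")

# AgeLabel is a hypothetical helper column: a labelled factor copy of Age.
MYdata$AgeLabel <- factor(MYdata$Age, levels = age_breaks, labels = age_labels)

# Naming the colours by factor level makes the level-to-colour mapping explicit.
age_colours <- setNames(c("#E69F00", "#56B4E9", "#009E73",
                          "#F0E442", "#0072B2", "#D55E00"),
                        age_labels)

# x stays numeric (spacing proportional to months); colour uses the factor.
p <- ggplot(MYdata, aes(x = Age, y = Richness, colour = AgeLabel)) +
  geom_boxplot(fill = NA) +
  geom_point() +
  scale_x_continuous(breaks = age_breaks) +
  scale_colour_manual(name = "Age", values = age_colours)

print(p)
```

Because the colour aesthetic is a factor, it also defines the grouping for the boxplots, so no separate `group` mapping is needed in this sketch.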

Finally, consider whether you really need to color each age differently, since the ages are already distinguished by their position on the x axis.
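If the colors are dropped, the month labels can go on the x axis instead of in a legend. The sketch below is one way that might look. The grey point color is an arbitrary choice, `age_breaks` and `age_labels` are hypothetical helper names, and `set.seed()` is added only for reproducibility.

```r
library(ggplot2)

set.seed(1)  # added only so the random sample is reproducible
MYdata <- data.frame(Age = rep(c(0, 1, 3, 6, 9, 12), each = 20),
                     Richness = rnorm(120, 10000, 2500))

age_breaks <- c(0, 1, 3, 6, 9, 12)
age_labels <- c("0 month", "1 month", "3 months",
                "6 months", "9 months", "12 months")

# Age stays continuous; group = Age gives one box per age value.
p_plain <- ggplot(MYdata, aes(x = Age, y = Richness)) +
  geom_boxplot(aes(group = Age)) +
  geom_point(colour = "grey40") +
  scale_x_continuous(breaks = age_breaks, labels = age_labels)

print(p_plain)
```

On a narrow plot the labels at 0 and 1 may crowd each other, since those breaks sit close together on a continuous axis.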

Brian Diggs answered Sep 19 '22 12:09
