For now, I'm just using something like this:
test_data$level <- rep("", nrow(test_data))
test_data[test_data$value <= 1, ]$level <- "1"
test_data[test_data$value > 1 & test_data$value <= 2, ]$level <- "2"
...
test_data[test_data$value > 4 & test_data$value <= 5, ]$level <- "5"
Just wondering if there's a better way to do this in R, or a way to simply apply some scale argument via ggplot2 to do the categorizing.
There could be a couple of approaches to this, so it was hard to phrase my question exactly. Here's the gist... I have data something like so:
set.seed(123)
test_data <- data.frame(var1 = rep(LETTERS[1:3], each = 5),
var2 = rep(letters[1:5], 3),
value = runif(30, 1, 5))
test_data
var1 value
1 A 2.150310
2 B 4.153221
3 C 2.635908
4 D 4.532070
5 E 4.761869
6 F 1.182226
7 G 3.112422
8 H 4.569676
9 I 3.205740
10 J 2.826459
I have a lot more data points, and am plotting something like this:
library(ggplot2)
p <- ggplot(test_data, aes(x = var1, y = var2, colour = value))
p <- p + geom_jitter(position = position_jitter(width = 0.1, height = 0.1))
p
Which gives something like so:
My actual data is from a subjective evaluation with 1-5 ratings, but I've bundled similar questions together and averaged them together so they're no longer integers.
I'm plotting the ratings per factor combination to visualize which combinations yielded higher ratings. The default continuous scale doesn't really "pop", and I'd like the color scale to treat "bins" of these values (0-1, 1-2, ... 4-5) the way scale_colour_discrete colors factors.
So, my question(s):
1) Is it possible with ggplot2 to "bin" these somehow via scale_colour_continuous so I can get the default factor level coloring scheme to apply even though this is continuous data?
2) If not, is there an easier way to create a new vector where I substitute numbers/letters for my values based on criteria? I'm a bit of an R novice, so I wasn't sure how to do it other than with a bunch of if() or conditional statements (test_data[test_data > 0 & test_data < 1, "values"] <- "a" or something like that).
The easiest solution is to do
ggplot(transform(test_data, Discrete = cut(value, seq(0, 5, 1), include.lowest = TRUE)), ...)
Now the data.frame passed to ggplot will include a factor column, Discrete, based on the column value, so you can do aes(..., color=Discrete,...) JUST in the context of your ggplot. Because transform returns a modified copy, the format of test_data will be preserved once you are done plotting.
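As a fuller sketch of the one-liner above, here is one way the pieces might fit together using the question's sample data. The name plot_data is only for illustration, and geom_jitter(width, height) is used as shorthand for the position_jitter call in the question.

```r
library(ggplot2)

# Sample data from the question
set.seed(123)
test_data <- data.frame(var1 = rep(LETTERS[1:3], each = 5),
                        var2 = rep(letters[1:5], 3),
                        value = runif(30, 1, 5))

# transform() returns a modified copy; test_data itself is not changed.
# plot_data is a hypothetical name used here for clarity.
plot_data <- transform(test_data,
                       Discrete = cut(value, seq(0, 5, 1), include.lowest = TRUE))

# Discrete is a factor, so ggplot2 picks its default discrete colour scale
p <- ggplot(plot_data, aes(x = var1, y = var2, colour = Discrete)) +
  geom_jitter(width = 0.1, height = 0.1)
p
```

Bins that contain no observations are dropped from the legend by default; adding scale_colour_discrete(drop = FALSE) should keep them if that matters for your plot.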
To keep a discrete column, of course, your best option is:
test_data$Discrete <- cut(test_data$value, seq(0, 5, 1), include.lowest = TRUE)
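If you want the same "1" to "5" labels as the manual assignments in the question, cut also takes a labels argument. The following is a minimal sketch in base R only; the loop that rebuilds the manual result (and the name manual) is just there to check that the two approaches agree, and it assumes all values lie between 0 and 5.

```r
set.seed(123)
test_data <- data.frame(value = runif(30, 1, 5))

# One call replaces the five manual assignments: bins are
# [0,1], (1,2], (2,3], (3,4], (4,5], labelled "1" to "5"
test_data$level <- cut(test_data$value, breaks = 0:5,
                       labels = as.character(1:5), include.lowest = TRUE)

# Manual approach from the question, rebuilt for comparison
manual <- rep("", nrow(test_data))
for (k in 1:5) {
  manual[test_data$value > k - 1 & test_data$value <= k] <- as.character(k)
}

identical(as.character(test_data$level), manual)
```

Note that cut returns a factor rather than a character vector, which is usually what you want for a discrete colour scale anyway.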
You can switch from the colour bar legend to the discrete-style legend.
library(RColorBrewer) # for brewer.pal
ggplot(test_data, aes(x = var1, y = var2, colour = value)) +
geom_jitter(position = position_jitter(width = 0.1, height = 0.1)) +
scale_colour_gradientn(guide = 'legend', colours = brewer.pal(n = 5, name = 'Set1'))
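Not part of the answers above, but worth a mention: newer versions of ggplot2 (3.3.0 and later, as far as I know) ship binned colour scales that address question 1 directly, without creating a new column. A rough sketch, assuming such a version is installed and using scale_colour_viridis_b as one of the available binned scales:

```r
library(ggplot2)

# Sample data from the question
set.seed(123)
test_data <- data.frame(var1 = rep(LETTERS[1:3], each = 5),
                        var2 = rep(letters[1:5], 3),
                        value = runif(30, 1, 5))

# value stays continuous; the scale does the binning.
# Breaks at 1:4 with limits 0 to 5 give the bins 0-1, 1-2, ..., 4-5.
p <- ggplot(test_data, aes(x = var1, y = var2, colour = value)) +
  geom_jitter(width = 0.1, height = 0.1) +
  scale_colour_viridis_b(breaks = 1:4, limits = c(0, 5))
p
```

The colours here come from a sequential palette rather than the default factor palette, so if you specifically want the scale_colour_discrete look, the cut approach is still the way to go.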