I have a data set that contains two variables: x = event number and y = assay amplitude. I am trying to create a scatterplot in ggplot2 where all of the points with y > 3000 are colored in one color and all of the points with y < 3000 are in a different color.
I can get the plot and change the color for all data points, but can't figure out how to define a color scheme based on the value threshold.
Here is a sample of the data I'm using:
dat <- data.frame(x=c(399, 16022, 14756, 2609, 1131, 12135,
7097, 12438, 12604, 14912, 11042,
14024, 7033, 4971, 15533, 4507, 4627,
12600, 7458, 14557, 3999, 3154, 6073),
y=c(3063.40137, 3687.42041, 3911.856,
4070.91748, 4089.99561, 4095.50317,
4159.899, 4173.117, 4177.78955,
4186.46875, 4201.874, 4272.022,
638.615, 649.8995, 668.8346,
688.754639, 711.92, 712.689636,
721.1352, 737.841, 741.0727,
755.2549, 756.730652))
When creating graphs with the ggplot2 R package, colors can be specified either by name (e.g. "red") or by hexadecimal code (e.g. "#FF1234"). It is also possible to use pre-made color palettes from other R packages, such as viridis, RColorBrewer and ggsci.
To change scatter plot color according to group, specify the name of the data column containing the groups with the groupName argument, and use the groupColors argument to specify colors by hexadecimal code or by name. Note that groupName and groupColors are arguments of a wrapper function, not of ggplot2's own ggplot() or geom_point().
The point geom is used to create scatterplots. The scatterplot is most useful for displaying the relationship between two continuous variables.
You really just need to create a new indicator variable for this. As @hrbrmstr says, cut is a good general way to do this (it works for as many cutpoints as you want).
library(ggplot2)

# Indicator variable with two intervals: (-Inf, 3000] and (3000, Inf]
dat$col <- cut(dat$y,
               breaks = c(-Inf, 3000, Inf),
               labels = c("<=3000", ">3000"))

ggplot(dat, aes(x = x, y = y, color = col)) +
  geom_point()
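Since cut works for any number of cutpoints, here is a minimal sketch with three bands instead of two. The data values, the extra cutpoint at 1000, the column name band and the colors are all made up for illustration; only the 3000 threshold comes from the question.

```r
library(ggplot2)

# Small illustrative data set (made-up values, not the asker's data)
dat <- data.frame(x = 1:6,
                  y = c(500, 1500, 2999, 3000, 3001, 4200))

# Three bands; cut() uses right-closed intervals by default,
# so 3000 itself falls in the middle band
dat$band <- cut(dat$y,
                breaks = c(-Inf, 1000, 3000, Inf),
                labels = c("<=1000", "1000-3000", ">3000"))

# Named values tie each color to a factor level explicitly
p <- ggplot(dat, aes(x = x, y = y, color = band)) +
  geom_point() +
  scale_colour_manual(values = c("<=1000"    = "grey50",
                                 "1000-3000" = "blue",
                                 ">3000"     = "red"))
print(p)
```

Naming the entries of values means the colors stay attached to the right band even if a band happens to be empty in a given data set.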
This can be done on the fly with an ifelse call, without having to create an extra column in the dataset:
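ggplot(dat, aes(x = x, y = y)) +
  geom_point(aes(color = ifelse(y > 3000, 'red', 'blue'))) +
  # ifelse() yields the levels 'blue' and 'red'; named values keep the mapping explicit
  scale_colour_manual(labels = c("<=3000", ">3000"),
                      values = c(blue = 'blue', red = 'red')) +
  labs(color = "Values")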
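A variation on the same idea, sketched here as an alternative rather than as what the answer above does: map the logical condition y > 3000 directly and name the colors by "TRUE" and "FALSE". The data values are made up for illustration.

```r
library(ggplot2)

# Made-up values that straddle the 3000 threshold
dat <- data.frame(x = 1:4, y = c(700, 2999, 3000, 4100))

# The logical y > 3000 becomes a discrete variable with levels FALSE, TRUE
p <- ggplot(dat, aes(x = x, y = y, color = y > 3000)) +
  geom_point() +
  scale_colour_manual(name = "Values",
                      values = c("FALSE" = "blue", "TRUE" = "red"),
                      labels = c("<=3000", ">3000"))
print(p)
```

This avoids having the color names double as the category labels, so changing a color later only touches the values argument.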