How do I define color groups based on numerical threshold values for ggplot2 scatterplot

I have a data set with two variables: x = event number and y = assay amplitude. I am trying to create a scatterplot in ggplot2 in which all points with y > 3000 are drawn in one color and all points with y < 3000 in a different color.

I can get the plot and change the color for all data points, but can't figure out how to define a color scheme based on the value threshold.

Here is a sample of the data I'm using:

dat <- data.frame(x = c(399, 16022, 14756, 2609, 1131, 12135,
                        7097, 12438, 12604, 14912, 11042,
                        14024, 7033, 4971, 15533, 4507, 4627,
                        12600, 7458, 14557, 3999, 3154, 6073),
                  y = c(3063.40137, 3687.42041, 3911.856,
                        4070.91748, 4089.99561, 4095.50317,
                        4159.899, 4173.117, 4177.78955,
                        4186.46875, 4201.874, 4272.022,
                        638.615, 649.8995, 668.8346,
                        688.754639, 711.92, 712.689636,
                        721.1352, 737.841, 741.0727,
                        755.2549, 756.730652))
user3502134 asked Apr 05 '14 21:04




2 Answers

You really just need to create a new indicator variable for this. As @hrbrmstr says, cut() is a good general way to do it, and it works for as many cutpoints as you want.

library(ggplot2)

# cut() intervals are right-closed by default: (-Inf, 3000] and (3000, Inf)
dat$col <- cut(dat$y,
               breaks = c(-Inf, 3000, Inf),
               labels = c("<=3000", ">3000"))

ggplot(dat, aes(x = x, y = y, color = col)) +
  geom_point()
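To illustrate the point about multiple cutpoints, here is a minimal sketch with three groups. The cutpoints at 1000 and 4000 and the column name grp are assumptions chosen for illustration only, and the data frame is a six-row subset of the question's sample data so that the example is self-contained.

```r
# Subset of the question's sample data (rows 1, 2, 7, 13, 14 and 23)
dat <- data.frame(
  x = c(399, 16022, 7097, 7033, 4971, 6073),
  y = c(3063.40137, 3687.42041, 4159.899, 638.615, 649.8995, 756.730652)
)

# Hypothetical cutpoints at 1000 and 4000 (not from the question).
# Intervals are right-closed by default: (-Inf, 1000], (1000, 4000], (4000, Inf)
dat$grp <- cut(dat$y,
               breaks = c(-Inf, 1000, 4000, Inf),
               labels = c("<=1000", "1000-4000", ">4000"))

# Plot only if ggplot2 is installed; each factor level gets its own color
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = x, y = y, color = grp)) +
    geom_point()
  print(p)
}
```

Adding another group only requires one more break and one more label; the ggplot() call stays the same.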
Gregor Thomas answered Sep 16 '22 18:09



This can be done on the fly with ifelse(), without creating an extra column in the dataset:

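library(ggplot2)

ggplot(dat, aes(x = x, y = y)) +
  geom_point(aes(color = ifelse(y > 3000, 'red', 'blue'))) +
  # The mapped values 'blue' and 'red' sort alphabetically, so 'blue' comes
  # first; labels and values below are matched to the groups in that order.
  scale_colour_manual(labels = c("<=3000", ">3000"), values = c('blue', 'red')) +
  labs(color = "Values")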
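The version above depends on the alphabetical order of the strings 'blue' and 'red'. One possible variation, sketched here as an assumption rather than as part of the original answer, is to let ifelse() return the group labels and pass a named vector to scale_colour_manual(), so each color is matched to its group by name. The name grp_colors is hypothetical, and the data frame is a six-row subset of the question's sample data.

```r
# Subset of the question's sample data (rows 1, 2, 7, 13, 14 and 23)
dat <- data.frame(
  x = c(399, 16022, 7097, 7033, 4971, 6073),
  y = c(3063.40137, 3687.42041, 4159.899, 638.615, 649.8995, 756.730652)
)

# Hypothetical helper: colors keyed by group label, matched by name not position
grp_colors <- c("<=3000" = "blue", ">3000" = "red")

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat, aes(x = x, y = y)) +
    geom_point(aes(color = ifelse(y > 3000, ">3000", "<=3000"))) +
    scale_colour_manual(values = grp_colors) +
    labs(color = "Values")
  print(p)
}
```

Because the values are named, renaming a group or changing a color cannot silently swap the legend entries.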
alp answered Sep 19 '22 18:09
