
Assign point color depending on data.frame column value R

Tags:

r

ggplot2

This is my first question on SO; I hope someone can help me answer it.

I'm reading data from a csv with R with data<-read.csv("/data.csv") and get something like:

Group    x   y  size    Color
Medium   1   2  2000    yellow
Small   -1   2  1000    red
Large    2  -1  4000    green
Other   -1  -1  2500    blue

The color of each group may vary: colors are assigned by a formula when the csv file is generated. The four colors above are all the possible ones, but the number of groups may also vary.

I've been trying to use ggplot() like so:

data<-read.csv("data.csv")
xlim<-max(c(abs(min(data$x)),abs(max(data$x))))
ylim<-max(c(abs(min(data$y)),abs(max(data$y))))
data$Color<-as.character(data$Color)
print(data)
ggplot(data, aes(x = x, y = y, label = Group)) +
geom_point(aes(size = size, colour = Group), show.legend = TRUE) +
scale_color_manual(values=c(data$Color)) +
geom_text(size = 4) +
scale_size(range = c(5,15)) +
scale_x_continuous(name="x", limits=c(xlim*-1-1,xlim+1))+
scale_y_continuous(name="y", limits=c(ylim*-1-1,ylim+1))+
theme_bw()

Everything is correct except for the colors:

  • Small is drawn blue
  • Medium is drawn red
  • Other is drawn green
  • Large is drawn yellow

I noticed the legend at the right orders the Groups alphabetically (Large, Medium, Other, Small), but the colors stay in the csv file order.

Here is a screenshot of the plot.

[Image: screenshot of the plot]

Can anyone tell me what's missing in my code to fix this? Other approaches to achieve the same result are welcome.

gantonioid asked Feb 08 '16 21:02



1 Answer

One way to do this, as suggested by help("scale_colour_manual"), is to use a named character vector, so that each color is matched to its group by name rather than by position:

col <- as.character(data$Color)
names(col) <- as.character(data$Group)

Then pass this vector as the values argument of the scale:

# just showing the relevant line
scale_color_manual(values=col) +
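
To see why the names matter, here is a minimal base-R sketch (no plotting). It assumes the sample data from the question and assumes that the scale orders the groups alphabetically, as the legend in the question suggests. The names lvls, by_position and by_name are only illustrative.

```r
# Inline copy of the sample data from the question (assumed to match data.csv)
data <- read.table(text = "Group x y size Color
Medium 1 2 2000 yellow
Small -1 2 1000 red
Large 2 -1 4000 green
Other -1 -1 2500 blue", header = TRUE, stringsAsFactors = FALSE)

# Group names in alphabetical order, as they appear in the legend
lvls <- sort(unique(data$Group))

# Unnamed vector: colors are paired with the sorted groups by position,
# which reproduces the wrong colors described in the question
by_position <- setNames(data$Color, lvls)

# Named vector: colors are paired with groups by name, whatever the order
col <- setNames(data$Color, data$Group)
by_name <- col[lvls]

print(by_position)
print(by_name)
```

Comparing the two printed vectors shows the mismatch: by position, Large ends up paired with yellow, while by name it is paired with green as in the csv file.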

Full code:

library(ggplot2)

# symmetric axis limits around zero
xlim<-max(c(abs(min(data$x)),abs(max(data$x))))
ylim<-max(c(abs(min(data$y)),abs(max(data$y))))

# named vector: names are the groups, values are their colors
col <- as.character(data$Color)
names(col) <- as.character(data$Group)

ggplot(data, aes(x = x, y = y, label = Group)) +
  geom_point(aes(size = size, colour = Group), show.legend = TRUE) +
  scale_color_manual(values=col) +
  geom_text(size = 4) +
  scale_size(range = c(5,15)) +
  scale_x_continuous(name="x", limits=c(xlim*-1-1,xlim+1))+
  scale_y_continuous(name="y", limits=c(ylim*-1-1,ylim+1))+
  theme_bw()

Output:

[Image: resulting plot]

Data

data <- read.table(text="Group    x   y  size    Color
Medium   1   2  2000    yellow
Small   -1   2  1000    red
Large    2  -1  4000    green
Other   -1  -1  2500    blue",header=TRUE)

scoa answered Oct 16 '22 18:10