this is my first question on SO, I hope someone can help me answer it.
I'm reading data from a csv with R
with data<-read.csv("/data.csv")
and get something like:
Group x y size Color
Medium 1 2 2000 yellow
Small -1 2 1000 red
Large 2 -1 4000 green
Other -1 -1 2500 blue
Each group color may vary, they are assigned by a formula when the csv
file is generated, but those are all the possible colors (the number of groups may also vary).
I've been trying to use ggplot()
like so:
data<-read.csv("data.csv")
xlim<-max(c(abs(min(data$x)),abs(max(data$x))))
ylim<-max(c(abs(min(data$y)),abs(max(data$y))))
data$Color<-as.character(data$Color)
print(data)
ggplot(data, aes(x = x, y = y, label = Group)) +
geom_point(aes(size = size, colour = Group), show.legend = TRUE) +
scale_color_manual(values=c(data$Color)) +
geom_text(size = 4) +
scale_size(range = c(5,15)) +
scale_x_continuous(name="x", limits=c(xlim*-1-1,xlim+1))+
scale_y_continuous(name="y", limits=c(ylim*-1-1,ylim+1))+
theme_bw()
Everything is correct except for the colors
I noticed the legend at the right orders the Groups alphabetically (Large, Medium, Other, Small), but the colors stay in the csv
file order.
Here is a screenshot of the plot.
Can anyone tell me what's missing in my code to fix this? other approaches to achieve the same result are welcome.
In R, colors can be specified either by name (e.g col = “red”) or as a hexadecimal RGB triplet (such as col = “#FFCC00”). You can also use other color systems such as ones taken from the RColorBrewer package.
To color the points in a scatterplot using ggplot2, we can use colour argument inside geom_point with aes. The color can be passed in multiple ways, one such way is to name the particular color and the other way is to giving a range or using a variable.
The color attribute is only used for point, line and scatter chart, fill is generally used for bar, column chart, etc. Color adds color to the border to plot whereas fill is to color inside bar/column, etc.
One way to do this, as suggested by help("scale_colour_manual")
is to use a named character vector:
col <- as.character(data$Color)
names(col) <- as.character(data$Group)
And then map the values
argument of the scale to this vector
# just showing the relevant line
scale_color_manual(values=col) +
full code
xlim<-max(c(abs(min(data$x)),abs(max(data$x))))
ylim<-max(c(abs(min(data$y)),abs(max(data$y))))
col <- as.character(data$Color)
names(col) <- as.character(data$Group)
ggplot(data, aes(x = x, y = y, label = Group)) +
geom_point(aes(size = size, colour = Group), show.legend = TRUE) +
scale_color_manual(values=col) +
geom_text(size = 4) +
scale_size(range = c(5,15)) +
scale_x_continuous(name="x", limits=c(xlim*-1-1,xlim+1))+
scale_y_continuous(name="y", limits=c(ylim*-1-1,ylim+1))+
theme_bw()
Ouput:
Data
data <- read.table("Group x y size Color
Medium 1 2 2000 yellow
Small -1 2 1000 red
Large 2 -1 4000 green
Other -1 -1 2500 blue",head=TRUE)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With