I'm trying to plot three datasets on the same graph. One dataset should appear as a set of unconnected points, whereas the other two should appear as points connected by lines. I can build the graph in base R with the following code:
x <- c(1,2,3,4)
y <- c(1.1,1.2,1.3,1.4)
y2 <- c(2.1,2.2,2.3,2.4)
x3 <- c(4,5,6,7)
y3 <- c(3.1,3.2,3.3,3.2)
p1 <- data.frame(x=x,y=y)
p2 <- data.frame(x=x,y=y2)
p3 <- data.frame(x=x3,y=y3)
plot(x,y,type="o", col="red")
points(x3,y3,col="darkgreen",pch=16)
points(x,y2,type="o",col="blue")
As shown in the code, two sets of points are plotted with type "o", meaning that the points are connected by a line, whereas the third set of points is not connected. I was trying to recreate this in ggplot2. I do the following in ggplot2:
library(reshape2)  # provides melt()
library(ggplot2)
zz <- melt(list(p1=p1,p2=p2,p3=p3), id.vars="x")
ggplot(zz, aes(x, value, color = L1)) +
  geom_point() +
  scale_color_manual("Dataset", values = c("p1" = "darkgreen", "p2" = "blue", "p3" = "red"))
Doing the above, I get the three sets of points in three different colors, but of course neither the red points nor the blue points are connected. If I want to connect the points, I can add geom_line() to the command above so that I have the following:
ggplot(zz, aes(x, value, color = L1)) +
  geom_point() +
  scale_color_manual("Dataset", values = c("p1" = "darkgreen", "p2" = "blue", "p3" = "red")) +
  geom_line()
Of course this results in lines connecting all the points, so that all red points are connected to each other, all blue points are connected to each other, and all green points are connected to each other. However, while I want the red and blue points to be connected, I don't want the green points to be connected. Is there a way to do this?
I could do the following (or similar to it):
ggplot(p2, aes(x,y)) +
  geom_point(color = "blue") +
  geom_line(color="blue") +
  geom_point(data=p3, color = "red") +
  geom_line(data=p3, color="red") +
  geom_point(data=p1, color = "darkgreen")
With this command, the red dots are connected, the blue are connected, and the green are disconnected. However, I do not want to do this as I want to be able to have all the point colors appear in the legend (and no legend appears in this solution).
The ggplot2 package can also draw a plot based on two different data sets. For this, we set the data argument within the ggplot function to NULL. Then we specify two geoms (i.e. geom_point and geom_line) and define the data set we want to use within each of those geoms.
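As a rough illustration of that idea, here is a minimal sketch in which ggplot() receives no data and each geom names its own data set. The data frames pts and path are made up for this example (they reuse the values of p1 and p2 from the question), and the column names x and y are assumed to be shared so that one aes() can serve both layers.

```r
library(ggplot2)

# Hypothetical example data: one set for the points, one for the line
pts  <- data.frame(x = c(1, 2, 3, 4), y = c(1.1, 1.2, 1.3, 1.4))
path <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 2.2, 2.3, 2.4))

# No data set in ggplot(); each layer is given its own data argument
p <- ggplot(data = NULL, mapping = aes(x = x, y = y)) +
  geom_point(data = pts) +   # unconnected points from pts
  geom_line(data = path)     # connected line from path

print(p)
```

Because nothing is mapped to color here, this version draws no legend, which is the limitation the question describes.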
The function geom_point() adds a layer of points to your plot, which creates a scatterplot.
The trick is that each layer can have its own dataset. So you have to subset the data to exclude L1=="p1" from the data provided to geom_line:
ggplot(zz, aes(x, y=value, color=L1)) +
  geom_point() +
  geom_line(data=zz[zz$L1!="p1", ]) +
  scale_color_manual("Dataset", values = c("p1" = "darkgreen", "p2" = "blue", "p3" = "red"))
You can feed a different dataset into each geom. So you can pass in a dataset excluding p1 into the geom_line layer. Something like this should work:
ggplot(zz, aes(x, value, color = L1)) +
  geom_point() +
  geom_line(data = subset(zz, L1 %in% c("p2", "p3")), aes(group = L1)) +
  scale_color_manual("Dataset", values = c("p1" = "darkgreen", "p2" = "blue", "p3" = "red"))
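To check that the line layer really leaves out p1, the following self-contained sketch may help. It builds zz by hand with the same columns the melt() call in the question produces (x, value, L1), so reshape2 is not needed; that construction is an assumption made only for this example. It then uses layer_data() from ggplot2 to look at the rows each layer draws.

```r
library(ggplot2)

# Stand-in for the melted data from the question (assumed columns: x, value, L1)
zz <- rbind(
  data.frame(x = c(1, 2, 3, 4), value = c(1.1, 1.2, 1.3, 1.4), L1 = "p1"),
  data.frame(x = c(1, 2, 3, 4), value = c(2.1, 2.2, 2.3, 2.4), L1 = "p2"),
  data.frame(x = c(4, 5, 6, 7), value = c(3.1, 3.2, 3.3, 3.2), L1 = "p3")
)

p <- ggplot(zz, aes(x, value, color = L1)) +
  geom_point() +                              # points for all three data sets
  geom_line(data = subset(zz, L1 != "p1")) +  # lines for p2 and p3 only
  scale_color_manual("Dataset",
                     values = c(p1 = "darkgreen", p2 = "blue", p3 = "red"))

# layer_data() returns the data a given layer draws after ggplot2 has built the plot
points_drawn <- layer_data(p, 1)
lines_drawn  <- layer_data(p, 2)
nrow(points_drawn)
nrow(lines_drawn)
```

Since color is still mapped to L1 in the shared aes(), the legend should list all three data sets even though only two of them get lines.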