 

Getting different results using ggplot and base plot functions

I have a table available here: http://ulozto.cz/xAeP3Ahn/res2-txt. I am trying to make a point plot from that.

I read my table:

res2<-read.table("res2.txt", header = TRUE, sep="\t")

and create 2 plots.

(1) This is the script using the base plot function:

plot(res2$V2, res2$dist06, type = "n")
points(subset(res2$V2, year == 2006), subset(res2$dist06, year == 2006), pch = 19, col = "red", cex = 1)
points(subset(res2$V2, year == 2007), subset(res2$dist06, year == 2007), pch = 19, col = "green", cex = 1)
points(subset(res2$V2, year == 2008), subset(res2$dist06, year == 2008), pch = 19, col = "black", cex = 1)
points(subset(res2$V2, year == 2009), subset(res2$dist06, year == 2009), pch = 19, col = "blue", cex = 1)
points(subset(res2$V2, year == 2011), subset(res2$dist06, year == 2011), pch = 19, col = "yellow", cex = 1)
legend("topright", c("2006", "2007", "2008", "2009", "2011"),
   col= c("red", "green", "black", "blue", "yellow"),
   pch = c(19,19,19,19,19))

(2) and for ggplot2:

library(ggplot2)

res2$year <- as.factor(res2$year)  # treat year as a discrete variable
ggplot(data = res2, aes(x = V2, y = dist06, color = year)) +
    geom_point(shape = 16) +
    xlab("threshold") + ylab("Euclidean distance") +
    scale_colour_manual(name = "year",  # legend title and one colour per year
                        values = c("red", "green", "black", "blue", "yellow")) +
    theme_bw()

Here are my results:

results from the base plot function (1) and from ggplot2 (2)

My question is: why are the points in different positions in the two plots? Is the problem only in the colors and the legend, or are the "subsets" defined incorrectly? Why is 2006 marked red in both plots but positioned differently, and likewise for 2011 and the other years? Where am I going wrong? Thanks for any recommendations; I have been stuck on this for three days.

Here are my results from Excel, so the plot from ggplot2 (2) has to be the right one:

plot from the same data in Excel

maycca asked Sep 30 '22 13:09


1 Answer

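I suppose this is a side effect of using subset incorrectly. When its first argument is a plain vector such as res2$V2, the condition year == 2006 is not evaluated inside res2, so year is looked up in your workspace instead of in the data frame. The first argument should be the whole data frame, like so: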

subset(res2, year == 2006)$V2

or

subset(res2, year == 2006, select = V2)

(Side note: the objects returned by these commands are different, since the first is a vector and the second is a one-column data frame, but both will work for your plot.)
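To make the difference concrete, here is a minimal sketch on made-up data. The tiny res2 below and the stray year vector in the workspace are assumptions for illustration only; the real table and whatever year object existed in the asker's session are not known.

```r
# Toy stand-in for res2 (made-up values, not the real table)
res2 <- data.frame(year = c(2006, 2006, 2007, 2007),
                   V2   = c(1, 2, 3, 4))

# Hypothetical leftover object in the workspace, ordered differently
# from res2$year
year <- c(2007, 2006, 2007, 2006)

# Vector as first argument: the condition uses the workspace `year`
wrong <- subset(res2$V2, year == 2006)

# Data frame as first argument: the condition uses res2$year
right <- subset(res2, year == 2006)$V2

print(wrong)  # 2 4
print(right)  # 1 2
```

If no year object exists in the workspace at all, the first call stops with an "object 'year' not found" error rather than silently picking the wrong points.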

I'd recommend using a bracket notation:

res2$V2[res2$year == 2006]

Either way, you'll get a correct plot.
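Put together, the corrected base plot could look roughly like the sketch below. It uses bracket indexing inside a loop over the years. The small res2 here is made-up data so the snippet runs on its own, and the years and cols vectors are helper names introduced for this example; with the real table you would keep the read.table call from the question instead.

```r
# Toy stand-in for res2 (made-up values, not the real table)
res2 <- data.frame(
  year   = c(2006, 2006, 2007, 2008, 2009, 2011),
  V2     = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
  dist06 = c(1.0, 1.5, 2.0, 2.5, 3.0, 3.5)
)

# One colour per year, in the same order as in the question
years <- c(2006, 2007, 2008, 2009, 2011)
cols  <- c("red", "green", "black", "blue", "yellow")

# Empty frame first, so the axes cover all points
plot(res2$V2, res2$dist06, type = "n",
     xlab = "threshold", ylab = "Euclidean distance")

# Add each year's points, selecting rows with a logical index on res2$year
for (i in seq_along(years)) {
  keep <- res2$year == years[i]
  points(res2$V2[keep], res2$dist06[keep], pch = 19, col = cols[i])
}

legend("topright", legend = years, col = cols, pch = 19)
```

Because keep is always computed from res2$year, the x and y values for each year come from the same rows, which is what the original subset calls failed to guarantee.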

As you may have noticed, you do not have to copy/paste nearly as much with the ggplot approach. Nice!

tonytonov answered Oct 02 '22 16:10