
Converting simple ggplot2 code to use data.table

My old code looked like this:

library(ggplot2)
gp<-ggplot(NULL,aes(x=Income))
gp<-gp+geom_density(data=dat$Male,color="blue")
gp<-gp+geom_density(data=dat$Female,color="green")
gp<-gp+geom_density(data=dat$Alien,color="red")
plot(gp) #Works

Now I have started using the excellent data.table library (instead of data.frame):

library(data.table)
cols<-c("blue","green","red")
gp<-ggplot(NULL,aes(x=Income))
dat[, list(gp+geom_density(data=.SD, color=cols[.GRP])), by=Gender]
#I even tried
dat[, list(gp<-gp+geom_density(data=.SD, color=cols[.GRP])), by=Gender]
plot(gp) #Error: No layers in plot

I am not sure exactly what is wrong, but it seems that assignments made inside the j expression do not take effect in the outer scope.

How can I achieve this in an idiomatic data.table way?

Kostolma asked Mar 20 '13 15:03



1 Answer

ggplot2 should be used with long format data.tables in the same way as with long format data.frames:

library(data.table)
set.seed(42)

dat <- rbind(data.table(gender="male",value=rnorm(1e4)),
             data.table(gender="female",value=rnorm(1e4,2,1))
             )

library(ggplot2)
p1 <- ggplot(dat,aes(x=value,color=gender)) + geom_density()
print(p1)
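
Applied to the question's setup, the same idea might look like the sketch below. It assumes the original `dat` was a list with one table per group, each holding an `Income` column; `dat_list`, `dat_long` and the simulated incomes are hypothetical stand-ins, not the asker's data. `rbindlist()` with `idcol` stacks the tables into long format, and `scale_color_manual()` keeps the colours from the question.

```r
library(data.table)
library(ggplot2)

set.seed(42)
# Hypothetical stand-in for the question's data: one table per group
dat_list <- list(
  Male   = data.table(Income = rnorm(1000, 50, 10)),
  Female = data.table(Income = rnorm(1000, 55, 10)),
  Alien  = data.table(Income = rnorm(1000, 70, 15))
)

# Stack the tables; the list names become the values of the Gender column
dat_long <- rbindlist(dat_list, idcol = "Gender")

# Named vector mapping each group to the colour used in the question
cols <- c(Male = "blue", Female = "green", Alien = "red")

# One layer, one data set; ggplot2 splits the densities by Gender
gp <- ggplot(dat_long, aes(x = Income, color = Gender)) +
  geom_density() +
  scale_color_manual(values = cols)
print(gp)
```

With the data in this shape there is no need to loop over groups or add layers one at a time.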

Don't feed wide format data.frames (or data.tables) to ggplot2.
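
If the data really is in wide format, with one column per group, a reshape along these lines should bring it into the long format ggplot2 expects. This is only a sketch: the table `wide` and its columns are made up for illustration, and `melt()` here is the data.table method.

```r
library(data.table)
library(ggplot2)

set.seed(42)
# Hypothetical wide table: one income column per group
wide <- data.table(Male   = rnorm(1000, 50, 10),
                   Female = rnorm(1000, 55, 10),
                   Alien  = rnorm(1000, 70, 15))

# Reshape to long format: one row per observation, with a grouping column
long <- melt(wide, measure.vars = c("Male", "Female", "Alien"),
             variable.name = "Gender", value.name = "Income")

p <- ggplot(long, aes(x = Income, color = Gender)) + geom_density()
print(p)
```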

Plotting will be quite slow if you have many groups, but that is down to the internals of ggplot2, so it is nothing data.table can really help with (until Hadley implements it somehow). You can try to calculate the densities outside ggplot2, but that will only get you so far:

set.seed(42)
dat2 <- data.table(gender=as.factor(1:5000),value=rnorm(1e7))
plotdat <- dat2[,{d <- density(value); list(x_den=d$x,y_den=d$y)},by=gender] #estimate each group's density only once
p2 <- ggplot(plotdat,aes(x=x_den,y=y_den,color=gender)) + geom_line()
print(p2) #this needs some CPU time

Of course, if you have many groups, you are probably making the wrong kind of plot.

Roland answered Oct 06 '22 22:10
