What is the ggplot2/plyr way to calculate statistical tests between two subgroups?

I am a rather novice user of R and have come to appreciate the elegance of ggplot2 and plyr. Right now, I am trying to analyze a large dataset that I cannot share here, so I have reconstructed my problem with the diamonds dataset (shortened for convenience). Without further ado:

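library(ggplot2)  # provides ggplot() and the diamonds dataset
# keep only the two cuts to compare
diam <- diamonds[diamonds$cut %in% c("Fair", "Ideal"), ]
boxplots <- ggplot(diam, aes(x = cut, y = price)) + geom_boxplot(aes(fill = cut)) + facet_wrap(~ color)
print(boxplots)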

The plot is a set of boxplots, one panel per color, comparing the price of the two cuts "Fair" and "Ideal".

I would now very much like to proceed by statistically comparing the two cuts within each color subgroup (D, E, F, ..., J) using either t.test or wilcox.test.

How would I implement this in a way that is as elegant as the ggplot2 syntax? I assume I would use ddply from the plyr package, but I couldn't figure out how to feed the two subgroups into a function that calculates the appropriate statistics.
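For reference, here is a minimal sketch of the kind of ddply call the question seems to be reaching for. It is one possible approach, not necessarily the most elegant one. The helper name compare_cuts and the output column names t_p and wilcox_p are made up for illustration; ddply, t.test and wilcox.test are the functions named above.

```r
library(ggplot2)  # provides the diamonds dataset
library(plyr)     # provides ddply() and the .() quoting helper

# Keep only the two cuts being compared, as in the question
diam <- diamonds[diamonds$cut %in% c("Fair", "Ideal"), ]

# Hypothetical helper: receives the rows for one color and
# returns a one-row data frame with both p-values
compare_cuts <- function(d) {
  fair  <- d$price[d$cut == "Fair"]
  ideal <- d$price[d$cut == "Ideal"]
  data.frame(
    t_p      = t.test(fair, ideal)$p.value,
    wilcox_p = wilcox.test(fair, ideal)$p.value
  )
}

# ddply splits diam by color, applies the helper to each piece,
# and stacks the one-row results into a single data frame
pvals <- ddply(diam, .(color), compare_cuts)
print(pvals)
```

The result has one row per color, so it can be inspected next to the facets of the plot. Since one test is run per color, adjusting the p-values for multiple comparisons (for example with p.adjust) may be worth considering.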