I am a rather novice user of R and have come to appreciate the elegance of ggplot2 and plyr. Right now, I am trying to analyze a large dataset that I cannot share here, but I have reconstructed my problem with the diamonds dataset (shortened for convenience). Without further ado:
library(ggplot2)  # also provides the diamonds dataset
diam <- diamonds[diamonds$cut %in% c("Fair", "Ideal"), ]
boxplots <- ggplot(diam, aes(x = cut, y = price)) + geom_boxplot(aes(fill = cut)) + facet_wrap(~ color)
print(boxplots)
The plot is a set of boxplots, one panel per color, comparing the price of the two cuts "Fair" and "Ideal".
I would now very much like to proceed by statistically comparing the two cuts within each color subgroup (D, E, F, ..., J) using either t.test or wilcox.test.
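To make the goal concrete, here is a minimal sketch of the comparison for a single color subgroup. It assumes the formula interface `price ~ cut` and picks color "D" purely as an illustration; the names `diam_d`, `tt`, and `wt` are placeholders, not part of the original code.

```r
library(ggplot2)  # provides the diamonds dataset

# Keep only the two cuts of interest and drop the unused factor levels.
diam <- droplevels(as.data.frame(diamonds[diamonds$cut %in% c("Fair", "Ideal"), ]))

# Rows for one color subgroup ("D" is an arbitrary choice for illustration).
diam_d <- diam[diam$color == "D", ]

# Compare price between the two cuts within that subgroup.
tt <- t.test(price ~ cut, data = diam_d)       # Welch two-sample t-test
wt <- wilcox.test(price ~ cut, data = diam_d)  # Wilcoxon rank-sum test

print(tt$p.value)
print(wt$p.value)
```

What I am after is this same comparison repeated for every color, with the results collected in one table.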
How would I implement this in a way that is as elegant as the ggplot2 syntax? I assume I would use ddply from the plyr package, but I couldn't figure out how to feed two subgroups into a function that calculates the appropriate statistics.
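One possible shape for this, offered as a sketch rather than a definitive solution: split by color with ddply and let the per-group function receive both cuts at once, so that the formula `price ~ cut` separates them inside each piece. The helper name `compare_cuts` and the output column names are hypothetical.

```r
library(ggplot2)  # provides the diamonds dataset
library(plyr)     # provides ddply

# Keep only the two cuts of interest; convert to a plain data frame for plyr.
diam <- droplevels(as.data.frame(diamonds[diamonds$cut %in% c("Fair", "Ideal"), ]))

# Hypothetical helper: takes the rows for one color (both cuts included)
# and returns a one-row data frame with the p-values of both tests.
compare_cuts <- function(df) {
  tt <- t.test(price ~ cut, data = df)
  wt <- wilcox.test(price ~ cut, data = df)
  data.frame(t_p_value = tt$p.value, wilcox_p_value = wt$p.value)
}

# Apply the helper to each color subgroup and bind the rows together.
results <- ddply(diam, .(color), compare_cuts)
print(results)
```

The key point in this sketch is that the data are split on color only, not on cut, so each call sees both groups. Running seven tests also raises a multiple-comparison question, which `p.adjust` could address on the resulting p-value columns.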