Violin Plot (geom_violin) with aggregated values

Tags: r, ggplot2

I would like to create violin plots from aggregated data. My data has a category column, a value column and a count column:

data <- data.frame(category = rep(LETTERS[1:3],3),
                   value = c(1,1,1,2,2,2,3,3,3),
                   count = c(3,2,1,1,2,3,2,1,3))

If I create a simple violin plot it looks like this:

library(ggplot2)  # provides ggplot(), aes() and geom_violin()

plot <- ggplot(data, aes(x = category, y = value)) + geom_violin()
plot


(source: ahschulz.de)

That is not what I wanted: the plot ignores the count column. One solution would be to reshape the data frame by repeating the row of each category-value combination count times. The problem is that my counts go up to millions, which takes hours to plot! :-(
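For reference, the reshaping approach mentioned above could look roughly like this on the small example data. This is only a sketch in base R; the name `expanded` is made up for illustration, and it is exactly the step that becomes impractical once the counts reach millions.

```r
# Example data from the question
data <- data.frame(category = rep(LETTERS[1:3], 3),
                   value = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   count = c(3, 2, 1, 1, 2, 3, 2, 1, 3))

# Repeat each row index `count` times, then keep only category and value
# (`expanded` is a hypothetical name, not from the question)
row_index <- rep(seq_len(nrow(data)), times = data$count)
expanded <- data[row_index, c("category", "value")]

# The expanded frame has one row per original observation
nrow(expanded) == sum(data$count)
```

With counts in the millions, `expanded` would have millions of rows, which is why an approach that works on the aggregated rows is preferable.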

Is there a way to do this with the aggregated data directly?

Thanks in advance!

asked May 06 '13 11:05 by ahs85


2 Answers

You can supply a weight aesthetic, which is used when the density (and so the area of each violin) is calculated.

plot2 <- ggplot(data, aes(x = category, y = value, weight = count)) + geom_violin()
plot2

You may get warning messages that the weights do not add up to one, but that is OK. See here for similar/related discussion.
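To see why weighting stands in for replicating rows, here is a small check using base R's `density()` rather than ggplot2 itself. It is a sketch under assumptions: the bandwidth `bw = 0.5` and the evaluation grid are arbitrary choices of mine, and the weights are normalised to sum to one by hand. The bandwidth is fixed because `density()` picks its automatic bandwidth without looking at the weights, so the two estimates would not otherwise be comparable.

```r
# Example data from the question
data <- data.frame(category = rep(LETTERS[1:3], 3),
                   value = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   count = c(3, 2, 1, 1, 2, 3, 2, 1, 3))

# Look at a single category
a <- data[data$category == "A", ]

# Weighted density on the aggregated rows (weights normalised to sum to 1)
w <- a$count / sum(a$count)
d_weighted <- density(a$value, weights = w, bw = 0.5, from = 0, to = 4, n = 64)

# Unweighted density on the fully expanded values
d_expanded <- density(rep(a$value, times = a$count), bw = 0.5,
                      from = 0, to = 4, n = 64)

# The two estimates agree up to floating point error
same_density <- isTRUE(all.equal(d_weighted$y, d_expanded$y, tolerance = 1e-6))
same_density
```

So for a given bandwidth, the weighted estimate on nine rows carries the same information as the estimate on the expanded data.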


answered Oct 25 '22 13:10 by Andy W


Using stat="identity" and specifying a violinwidth aesthetic appears to work, although I had to put in a fudge factor:

ggplot(data, aes(x = category, y = value)) + 
   geom_violin(stat="identity",aes(violinwidth=0.2*count))
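
One possible way to avoid picking the fudge factor by hand is to rescale the counts so that the largest count in each category becomes 1. This is only a sketch: the column name `width` is made up, and it assumes that violinwidth values between 0 and 1 keep each violin inside its slot on the x axis. The plotting step only runs if ggplot2 is installed.

```r
# Example data from the question
data <- data.frame(category = rep(LETTERS[1:3], 3),
                   value = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                   count = c(3, 2, 1, 1, 2, 3, 2, 1, 3))

# Scale counts to (0, 1] within each category
# (`width` is a hypothetical column name)
data$width <- data$count / ave(data$count, data$category, FUN = max)

# Build the plot only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(data, aes(x = category, y = value)) +
    geom_violin(stat = "identity", aes(violinwidth = width))
}
```

Note that this draws the raw counts as the outline instead of a smoothed density, and that every violin is stretched to the same maximum width, so widths are only comparable within a category.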

answered Oct 25 '22 14:10 by Ben Bolker