 

group by in R, ddply with weighted.mean

Tags:

r

group-by

I am trying to do a "group by"-style weighted mean in R. With a plain mean, the following code (using Hadley's plyr package) worked well.

ddply(mydf,.(period),mean)

If I use the same approach with weighted.mean, I get the error "'x' and 'w' must have the same length", which I do not understand because the weighted.mean call works outside ddply.

weighted.mean(mydf$mycol,mydf$myweight) # works just fine
ddply(mydf,.(period),weighted.mean,mydf$mycol,mydf$myweight) # returns the error described above
ddply(mydf,.(period),weighted.mean(mydf$mycol,mydf$myweight)) # different code, same story

I thought of writing a custom function instead of using weighted.mean and passing it to ddply, or even writing something new from scratch with subset. That seems like too much work for my case, though; hopefully there is a smarter solution using what's already there.
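For reference, the from-scratch route mentioned above is fairly short in base R. This is only a minimal sketch: the data frame below is made up for illustration and merely reuses the column names from the question (period, mycol, myweight).

```r
# Hypothetical data reusing the column names from the question
mydf <- data.frame(
  period   = c("a", "a", "b", "b"),
  mycol    = c(1, 3, 10, 20),
  myweight = c(1, 3, 1, 1)
)

# Split the rows by period, then compute the weighted mean within each piece
pieces <- split(mydf, mydf$period)
wmeans <- sapply(pieces, function(d) weighted.mean(d$mycol, d$myweight))
wmeans
```

The key point is that the function refers to columns of the piece `d`, not to columns of the full `mydf`.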

Thanks in advance for any suggestions!

Matt Bannert asked Jul 18 '10 21:07



People also ask

How do you calculate weighted mean in R?

The weighted mean is the sum of the products of each value and its weight, divided by the sum of the weights. If the weights are given as proportions, they sum to 1.
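The definition can be checked against R's built-in weighted.mean with a few lines. The values and weights here are arbitrary numbers chosen for illustration.

```r
x <- c(2, 4, 6)   # values (arbitrary example data)
w <- c(1, 1, 2)   # weights (arbitrary example data)

# Sum of value * weight, divided by the sum of the weights
by_hand  <- sum(x * w) / sum(w)
built_in <- weighted.mean(x, w)
all.equal(by_hand, built_in)

# Rescaling the weights to proportions (summing to 1) leaves the result unchanged
p <- w / sum(w)
all.equal(weighted.mean(x, p), built_in)
```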

What does Ddply do in R?

ddply() accepts a data.frame, splits it into pieces based on one or more factors, computes on the pieces, then returns the results as a data.frame. The built-in functions most relevant to ddply() are tapply() and friends.
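As a rough base-R counterpart of that split-compute-combine pattern, tapply() can compute a plain per-group mean. The sketch below uses the built-in iris data; note that tapply() works on a single vector, so it does not directly cover the weighted case where two columns are needed per group.

```r
# Mean Sepal.Length per Species: split the vector by group, apply mean to each part
group_means <- tapply(iris$Sepal.Length, iris$Species, mean)
group_means
```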


1 Answer

Use summarise (or summarize):

ddply(iris, "Species", summarise, 
  wmn = weighted.mean(Sepal.Length, Petal.Length),
  mn = mean(Sepal.Length))
hadley answered Oct 21 '22 03:10
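A likely explanation for the error in the question: ddply passes each group's piece (a data frame) as the first argument of the function, so in the first failing call weighted.mean receives the piece as x and the full-length mydf$mycol as w. The sketch below reproduces that situation in base R with the built-in iris data; the variable names piece, msg and wmn_setosa are made up for illustration.

```r
# One group's piece, similar to what ddply hands to the function
piece <- iris[iris$Species == "setosa", ]

# Passing the piece itself as x and a full-length column as w:
# length(piece) is the number of columns (5), length(w) is 150
msg <- tryCatch(
  weighted.mean(piece, iris$Petal.Length),
  error = function(e) conditionMessage(e)
)
msg

# Referring to columns of the piece keeps x and w the same length
wmn_setosa <- weighted.mean(piece$Sepal.Length, piece$Petal.Length)
wmn_setosa
```

With summarise, the column names inside weighted.mean(...) are looked up in each piece rather than in the full data frame, which is why the lengths match there.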
