I am trying to do a "group by"-style weighted mean in R. With a plain mean, the following code (using Hadley's plyr package) worked well.
ddply(mydf,.(period),mean)
If I use the same approach with weighted.mean, I get the following error: "'x' and 'w' must have the same length". I do not understand this, because the weighted.mean call works outside ddply.
weighted.mean(mydf$mycol,mydf$myweight) # works just fine
ddply(mydf,.(period),weighted.mean,mydf$mycol,mydf$myweight) # returns the error described above
ddply(mydf,.(period),weighted.mean(mydf$mycol,mydf$myweight)) # different code same story
I thought of writing a custom function instead of using weighted.mean and passing it to ddply, or even writing something new from scratch with subset. That seems like too much work for my case, though, and there should be a smarter solution with what's already there.
Thanks in advance for any suggestions!
The weighted mean is the sum of the products of each value and its weight, divided by the sum of the weights. If the weights are proportions, they sum to 1, so the division changes nothing.
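Written out, with values x_i and weights w_i (the symbols are only notation chosen here for illustration), the definition above reads roughly as:

```latex
\bar{x}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i},
\qquad\text{and if } \sum_{i=1}^{n} w_i = 1 \text{ then } \bar{x}_w = \sum_{i=1}^{n} w_i x_i .
```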
Today we will emphasize ddply(), which accepts a data.frame, splits it into pieces based on one or more factors, computes on the pieces, then returns the results as a data.frame. For the record, the built-in functions most relevant to ddply() are tapply() and friends.
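To make the split/compute/combine idea concrete, here is a minimal base R sketch. It is not plyr's implementation. The toy mydf below is made up, borrowing only the column names from the question. It also shows why the error in the question appears: the values come from one piece, while mydf$myweight is still the full-length column.

```r
# Hypothetical toy data standing in for the asker's mydf
mydf <- data.frame(
  period   = c("a", "a", "b", "b", "b"),
  mycol    = c(1, 3, 2, 4, 6),
  myweight = c(1, 3, 1, 1, 2)
)

# Split: one data frame per period
pieces <- split(mydf, mydf$period)

# Compute: values and weights both come from the same piece
wmn <- sapply(pieces, function(d) weighted.mean(d$mycol, d$myweight))

# Combine: collect the per-piece results into a data frame
result <- data.frame(period = names(wmn), wmn = unname(wmn))

# Mixing a piece with the full-length weight column reproduces the
# length-mismatch error; tryCatch captures the message instead of stopping
err <- tryCatch(
  weighted.mean(pieces$a$mycol, mydf$myweight),
  error = function(e) conditionMessage(e)
)
```

Here result holds one weighted mean per period, and err holds the same message quoted in the question.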
Use summarise (or summarize):
ddply(iris, "Species", summarise,
wmn = weighted.mean(Sepal.Length, Petal.Length),
mn = mean(Sepal.Length))
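Applied to the question's column names, the same pattern might look like the sketch below. The data frame is invented for illustration, and only period, mycol and myweight come from the question. With summarise, the column names are looked up inside each piece, so values and weights always have matching lengths. The custom-function route mentioned in the question is shown as an alternative.

```r
library(plyr)

# Hypothetical toy data; only the column names come from the question
mydf <- data.frame(
  period   = c("a", "a", "b", "b", "b"),
  mycol    = c(1, 3, 2, 4, 6),
  myweight = c(1, 3, 1, 1, 2)
)

# summarise evaluates mycol and myweight within each period's piece
res_summarise <- ddply(mydf, .(period), summarise,
                       wmn = weighted.mean(mycol, myweight))

# Alternative: an anonymous function that receives each piece as d
res_custom <- ddply(mydf, .(period), function(d) {
  data.frame(wmn = weighted.mean(d$mycol, d$myweight))
})
```

Both calls should return one row per period with the group-wise weighted mean in wmn. The summarise form is shorter, while the function form is handy when the per-group computation needs more than one step.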