group by in R, ddply with weighted.mean

Tags:

group-by

I am trying to compute a "group by"-style weighted mean in R. For a plain mean, the following code (using Hadley's plyr package) works well.

ddply(mydf,.(period),mean)

If I use the same approach with weighted.mean, I get the error "'x' and 'w' must have the same length", which I do not understand because the weighted.mean call works outside ddply.

weighted.mean(mydf$mycol,mydf$myweight) # works just fine
ddply(mydf,.(period),weighted.mean,mydf$mycol,mydf$myweight) # returns the error described above
ddply(mydf,.(period),weighted.mean(mydf$mycol,mydf$myweight)) # different code, same story

I thought of writing a custom function instead of using weighted.mean and passing it to ddply, or even writing something new from scratch with subset. That seems like too much work in my case, and there should be a smarter solution using what's already there.

Thanks in advance for any suggestions!

433

asked Jul 18 '10 21:07

Matt Bannert

1 Answer

Use summarise (or summarize):

ddply(iris, "Species", summarise, 
  wmn = weighted.mean(Sepal.Length, Petal.Length),
  mn = mean(Sepal.Length))
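
For context on why the attempts in the question fail: as far as I can tell, ddply passes each per-group piece of the data frame as the first argument of the function, so weighted.mean receives the piece as x and the full-length mydf$mycol as w, and the lengths cannot match. A minimal sketch of the same idea applied to the question's column names follows. The data values are made up for illustration, and the anonymous function is just one way to write it, not necessarily what the answer's author would recommend.

```r
library(plyr)

# Hypothetical data standing in for the asker's mydf
mydf <- data.frame(
  period   = c(1, 1, 2, 2),
  mycol    = c(10, 20, 30, 50),
  myweight = c(1, 3, 1, 1)
)

# ddply hands each per-period piece to the function as `d`,
# so the columns are taken from that piece, not from the full mydf
res_fun <- ddply(mydf, .(period), function(d) {
  data.frame(wmn = weighted.mean(d$mycol, d$myweight))
})

# The same computation with summarise, which evaluates the
# expression inside each piece
res_sum <- ddply(mydf, .(period), summarise,
  wmn = weighted.mean(mycol, myweight))
```

Both calls return one row per period with the weighted mean in the wmn column. The summarise form is shorter, while the explicit function is handy when the per-group computation needs more than one step.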

169

answered Oct 21 '22 03:10

hadley
