
R: using ddply with a function

Tags: r, plyr

I'm trying to use the "ddply" function in conjunction with the "summarize" function, but I'm having difficulty.

Below is an extract of my code:

  orderSubsConsolidate = ddply(merged, .(RIC,leg),summarize,fill.Quant = sum(fill.Quant),
                 fill.Price = function(merged){sum(merged[,7]*merged[,8])/sum(merged[,7])})

"merged" is the data frame containing the information that I would like to summarize. I am summarizing by the columns "RIC" and "leg". The problem I am having is applying a function to the fill.Price column.

This is an extract from the "merged" data frame:

Trade    RIC      leg  Basket.Name  Status  Order.Msg  fill.Quant  fill.Price
ATNATNP  ATNJ.J   1    ATNATNP1a1   Filled             100         200
ATNATNP  ATNPp.J  2    ATNATNP2a1   Filled             100         200
ATNATNP  ATNJ.J   1    ATNATNP1b1                      300         400

Essentially, the code above is trying to aggregate the fill.Quant column by RIC and leg, and then populate the corresponding fill.Price column with the quantity-weighted average price, sum(fill.Price*fill.Quant)/sum(fill.Quant), resulting in a table like the one below:

RIC      leg  fill.Quant  fill.Price
ATNJ.J   1    400         350
ATNPp.J  2    100         200
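
To make the target calculation concrete, here is a small sketch of the arithmetic for the ATNJ.J, leg 1 group, using the two sample rows for that group shown above. The variable names are made up for illustration and are not part of the original code.

```r
# Sample fills for the ATNJ.J / leg 1 group, copied from the extract above
quant <- c(100, 300)   # fill.Quant
price <- c(200, 400)   # fill.Price

total_quant    <- sum(quant)                       # aggregated quantity
weighted_price <- sum(price * quant) / sum(quant)  # quantity-weighted average price
plain_mean     <- mean(price)                      # unweighted mean, for contrast

print(c(total_quant = total_quant,
        weighted_price = weighted_price,
        plain_mean = plain_mean))
```

The weighted price works out as (100*200 + 300*400) / (100 + 300) = 350, whereas the unweighted mean of the two prices would be 300.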

Any help would be greatly appreciated. Let me know if anything is unclear.

Thanks!

Mike

Mike asked Dec 04 '13 10:12



2 Answers

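It looks like it should be the following. Note that fill.Price comes first: summarize updates columns in the order given, so if fill.Quant were summed first, weighted.mean would receive a length-one weight vector.

orderSubsConsolidate = ddply(merged, .(RIC,leg), summarize,
                       fill.Price = weighted.mean(fill.Price, fill.Quant),
                       fill.Quant = sum(fill.Quant))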
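
As a quick check of the helper used here, the following base-R sketch suggests that weighted.mean(x, w) matches sum(x*w)/sum(w) for the ATNJ.J rows, and that it rejects weights whose length differs from the values. The variable names are illustrative only and do not come from the original post.

```r
price <- c(200, 400)   # fill.Price for ATNJ.J, leg 1
quant <- c(100, 300)   # fill.Quant for the same rows

# With no missing values, weighted.mean(x, w) is sum(x * w) / sum(w)
by_helper  <- weighted.mean(price, quant)
by_formula <- sum(price * quant) / sum(quant)

# Passing an already-summed quantity (length 1) as the weights is an error;
# tryCatch captures the error message instead of stopping the script
msg <- tryCatch(
  weighted.mean(price, sum(quant)),
  error = function(e) conditionMessage(e)
)

print(c(by_helper = by_helper, by_formula = by_formula))
print(msg)
```

The length check is relevant inside summarize, where a column that has already been replaced by its group total no longer lines up with the other columns.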

Stephen Henderson answered Oct 01 '22 23:10


You can also use an anonymous function:

ddply(merged, .(RIC,leg), function(x) 
                           data.frame( fill.Quant = sum(x$fill.Quant), 
                                       fill.Price = sum(x[,7]*x[,8])/sum(x[,7])))
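
The positional indices 7 and 8 refer to fill.Quant and fill.Price in "merged", so the function breaks silently if the column order ever changes. Below is a rough sketch of the same split, apply and combine steps written in base R with columns referenced by name, so it can be run without plyr. The small data frame is a stand-in for "merged" that keeps only the columns needed; the names groups, pieces and result are assumptions for illustration.

```r
# Minimal stand-in for `merged`, keeping only the columns used here
merged <- data.frame(
  RIC        = c("ATNJ.J", "ATNPp.J", "ATNJ.J"),
  leg        = c(1, 2, 1),
  fill.Quant = c(100, 100, 300),
  fill.Price = c(200, 200, 400),
  stringsAsFactors = FALSE
)

# Split by RIC and leg, dropping combinations that never occur
groups <- split(merged, list(merged$RIC, merged$leg), drop = TRUE)

# Apply one summary per group, referring to columns by name
pieces <- lapply(groups, function(x) {
  data.frame(
    RIC        = x$RIC[1],
    leg        = x$leg[1],
    fill.Quant = sum(x$fill.Quant),
    fill.Price = sum(x$fill.Price * x$fill.Quant) / sum(x$fill.Quant)
  )
})

# Combine the per-group rows back into one data frame
result <- do.call(rbind, pieces)
rownames(result) <- NULL
print(result)
```

With ddply, the same anonymous function body (using x$fill.Quant and x$fill.Price) could be passed directly, and the grouping columns would be added to the result automatically.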

user1317221_G answered Oct 02 '22 01:10