Could someone please point out how to apply multiple functions to the same column using tapply (or any other method: plyr, etc.) so that the results end up in distinct columns? For example, if I have a data frame with
User MoneySpent
Joe 20
Ron 10
Joe 30
...
I want the result to show both the sum of MoneySpent and the number of occurrences for each user.
I used a function like this:
f <- function(x) c(sum(x), length(x))
tapply(df$MoneySpent, df$User, f)
But this does not split the result into columns; it gives something like:
Joe Joe 100, 5 # The sum=100, number of occurrences = 5, but it gets juxtaposed
Thanks in advance,
Raj
You can certainly do stuff like this using ddply from the plyr package:
library(plyr)
dat <- data.frame(x = rep(letters[1:3], 3), y = 1:9)
ddply(dat, .(x), summarise, total = NROW(piece), count = sum(y))
x total count
1 a 3 12
2 b 3 15
3 c 3 18
You can keep listing more summary functions, beyond just two, if you like. Note I'm being a little tricky here in calling NROW on an internal variable in ddply called piece. You could have just done something like length(y) instead. (And probably should; referencing the internal variable piece isn't guaranteed to work in future versions, I think. Do as I say, not as I do and just use length().)
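For reference, here is a minimal sketch of the same call written with length(y) in place of the internal piece variable. It assumes plyr is installed; the column names count and total are my own choice for illustration (count for the number of rows, total for the sum), not something ddply requires.

```r
library(plyr)

# Same toy data as above: three groups (a, b, c), three rows each
dat <- data.frame(x = rep(letters[1:3], 3), y = 1:9)

# One output column per summary function, computed within each group of x
res <- ddply(dat, .(x), summarise,
             count = length(y),  # number of rows in the group
             total = sum(y))     # sum of y within the group
print(res)
```

Each extra name = expression pair passed after summarise becomes one more column in the result.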
ddply() is conceptually the clearest, but sometimes it is useful to use tapply instead for speed reasons, in which case the following works:
do.call( rbind, tapply(df$MoneySpent, df$User, f) )
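To make that concrete, here is a small self-contained sketch in base R. The data frame holds only the three rows visible in the question (the rest is elided there), and naming the elements of the vector returned by f is my own addition so that the resulting matrix gets column labels.

```r
# Illustrative data: just the three rows shown in the question
df <- data.frame(User = c("Joe", "Ron", "Joe"),
                 MoneySpent = c(20, 10, 30))

# Return a named vector so rbind() produces labelled columns
f <- function(x) c(total = sum(x), count = length(x))

# tapply() gives one length-2 vector per user; rbind() stacks them as rows
res <- do.call(rbind, tapply(df$MoneySpent, df$User, f))
print(res)

# Optional: turn the matrix into a data frame with User as a regular column
res_df <- data.frame(User = rownames(res), res, row.names = NULL)
```

The result is a numeric matrix with one row per user, so a single value can be picked out with, e.g., res["Joe", "total"].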