R's ave() function is way more useful than its name suggests. It's basically a version of tapply() that returns a vector the same length as the input and slots the values back into the same order as the input for you.
> x <- 1:10
> ave(x, x %% 2, FUN=function(d) d-mean(d))
[1] -4 -4 -2 -2 0 0 2 2 4 4
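To make the "slots those values back" part concrete, here is a rough base-R sketch of the same computation done by hand with split() and unsplit(). The variable names (g, pieces, centered, result) are just mine for illustration, and this is only meant to mirror what ave() does, not to reproduce its internals exactly.

```r
x <- 1:10
g <- x %% 2                      # grouping key: 1 for odd, 0 for even

# Split into per-group pieces, transform each piece, then put the
# transformed values back in the original positions.
pieces   <- split(x, g)
centered <- lapply(pieces, function(d) d - mean(d))
result   <- unsplit(centered, g)

result
```

The result should match the ave() call above, which does the split, apply, and reassemble steps in a single call.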
You can achieve a similar effect with ddply(), but it requires a couple of extra copies of the data, as well as a couple of auxiliary variables:
> library(plyr)
> x <- 1:10
> val <- ddply(data.frame(x=x, id=1:10), .(x %% 2),
+              function(d) {d$y <- d$x-mean(d$x); d})
> val[order(val$id),]$y
[1] -4 -4 -2 -2 0 0 2 2 4 4
Is there some other plyr technique that matches the lightweight approach I can get with ave()?
You can shorten the ddply code somewhat by using transform:
ddply(data.frame(x=x, id=1:10), .(x %% 2),transform,y = x - mean(x))
but I don't think that ddply and other plyr functions are really meant to replicate the functionality of ave that you describe. For splitting and recombining single atomic vectors, tapply (and ave) are probably the right tools for the job.
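As a quick illustration of how the two base functions differ, here is a small sketch on the same x from the question. The names group_means and expanded_means are hypothetical and only there for the example.

```r
x <- 1:10

# tapply() collapses to one value per group, named by the group key.
group_means <- tapply(x, x %% 2, mean)

# ave() with its default FUN (mean) returns one value per input
# element, in the original order.
expanded_means <- ave(x, x %% 2)

group_means
expanded_means
```

So tapply is the natural choice when you want a per-group summary, and ave when you want the result lined up with the original vector.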
I recently wrote a blog post comparing ave, ddply, and data.table in terms of speed. I would recommend taking a look at data.table; it might prove beneficial. Apologies in advance if anyone takes offence at the self-promotion.
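For what it's worth, a minimal sketch of the same group-wise centering in data.table might look like the following. It assumes the data.table package is installed; the table name dt and the grouping column name g are placeholders of mine, and this makes no claim about relative speed.

```r
library(data.table)

dt <- data.table(x = 1:10)

# := adds the column by reference, computed within each group,
# and leaves the rows in their original order.
dt[, y := x - mean(x), by = .(g = x %% 2)]

dt$y
```

Because := assigns in place, there is no need for the auxiliary id column or the reordering step used in the ddply version.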