I often have data that are grouped by one or more variables, with several observations within each group. From the data frame, I want to select groups according to various criteria.
I commonly use a split-sapply-rbind approach, where I extract elements from a list using a logical vector.
Here is a small example. I start with a data frame with one grouping variable ('group'), and I wish to select groups that have a maximum mass of less than 45:
dd <- data.frame(group = rep(letters[1:3], each = 5),
                 mass = c(rnorm(5, 30), rnorm(5, 50), rnorm(5, 40)))
dd2 <- split(x = dd, f = dd$group)
dd3 <- dd2[sapply(dd2, function(x) max(x$mass) < 45)]
dd4 <- do.call(rbind, dd3)
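The three steps above can be wrapped in a small helper that takes the criterion as a function. This is only a sketch: `filter_groups` and its argument names are hypothetical (not part of base R or plyr), `vapply` is swapped in for `sapply` so the criterion must return a single logical value, and fixed masses replace `rnorm()` so the result is reproducible.

```r
# Hypothetical helper: keep only the groups for which `keep` returns TRUE.
filter_groups <- function(df, group_col, keep) {
  pieces <- split(x = df, f = df[[group_col]])       # one data frame per group
  selected <- pieces[vapply(pieces, keep, logical(1))]  # logical vector indexes the list
  do.call(rbind, selected)                           # stack the surviving groups
}

# Fixed values instead of rnorm(), chosen for illustration only.
dd_fixed <- data.frame(group = rep(letters[1:3], each = 5),
                       mass = c(29, 30, 31, 30, 32,
                                49, 50, 51, 52, 48,
                                39, 40, 41, 42, 38))

res <- filter_groups(dd_fixed, "group", function(x) max(x$mass) < 45)
unique(as.character(res$group))  # group "b" (maximum 52) is dropped
```

Passing the criterion as a function means the same helper can select on a minimum, a mean, or a row count without repeating the split and rbind steps.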
I have just started to use plyr, and now I wonder:
is there a plyr-only alternative to achieve this?
At least in this situation, this gives the same result:
library(plyr)
dd5 <- ddply(dd, .(group), function(x) x[max(x$mass) < 45, ])
all(dd4 == dd5)
[1] TRUE
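Why this works: inside the function, `max(x$mass) < 45` is a single TRUE or FALSE. A length-one logical row index is recycled, so `x[TRUE, ]` returns every row of the group and `x[FALSE, ]` returns a zero-row data frame, which contributes nothing when ddply combines the pieces. Here is a reproducible check of that behaviour; the fixed masses and the names `dd_fixed`, `kept`, and `base_res` are assumptions made for illustration, not the random data above.

```r
library(plyr)

# Fixed values instead of rnorm(), so the outcome does not depend on a seed.
dd_fixed <- data.frame(group = rep(letters[1:3], each = 5),
                       mass = c(29, 30, 31, 30, 32,
                                49, 50, 51, 52, 48,
                                39, 40, 41, 42, 38))

# Scalar logical index: TRUE keeps all rows of the group, FALSE keeps none.
kept <- ddply(dd_fixed, .(group), function(x) x[max(x$mass) < 45, ])

# Base R version from the question, for comparison.
pieces <- split(x = dd_fixed, f = dd_fixed$group)
base_res <- do.call(rbind, pieces[sapply(pieces, function(x) max(x$mass) < 45)])

# Row names differ (rbind builds names like "a.1"), so compare values only.
all(kept$mass == base_res$mass)
```

Note that the two results are equal in their values but not identical objects, because their row names differ.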