I'm trying to wrap my head around dplyr and use it instead of plyr. In my short time with R I've grown somewhat accustomed to ddply, so I'm using a "simple" example to compare how the same task looks in each package. With plyr, the following:
t1.table <- ddply(diamonds, c("clarity", "cut"), "nrow")
I receive a summary table of counts of diamonds by clarity and cut. In dplyr, the simplest example I can come up with is:
diamonds %>% select(clarity, cut) %>% group_by(clarity, cut) %>%
summarise(count=n()) -> t2.table
which seems a bit more involved. Is there a better way to simplify this? ~ thanks
In the latest version of dplyr you can simplify that down to this:
diamonds %>% count(clarity, cut)
Or if you want to keep the column name 'nrow':
diamonds %>% count(clarity, cut) %>% rename(nrow = n)
If you've got plyr (or another package that exports its own rename function) loaded in your environment, then you might need to prefix the call so that dplyr's version is used:
diamonds %>% count(clarity, cut) %>% dplyr::rename(nrow = n)
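To make the equivalence concrete, here is a minimal sketch comparing the long group_by/summarise form with count(). It assumes a reasonably recent dplyr (the `name` argument of count() and the `.groups` argument of summarise() are not in older releases), and the `toy` data frame is a made-up stand-in for diamonds so the snippet runs without ggplot2.

```r
library(dplyr)

# Toy stand-in for ggplot2::diamonds (values are made up for illustration)
toy <- data.frame(
  clarity = c("SI1", "SI1", "VS2", "VS2", "VS2"),
  cut     = c("Ideal", "Ideal", "Ideal", "Good", "Good")
)

# Long form: group by both columns, then count the rows in each group
long <- toy %>%
  group_by(clarity, cut) %>%
  summarise(nrow = n(), .groups = "drop")

# Short form: count() groups and tallies in one step;
# `name` sets the count column's name, so no rename() is needed
short <- toy %>% count(clarity, cut, name = "nrow")

# Compare the two results as plain data frames
all.equal(as.data.frame(long), as.data.frame(short))
```

Using `name = "nrow"` sidesteps the rename clash with plyr entirely, since no rename call is involved.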
Thanks for the help. I like this answer. Not quite as compact as the original ddply command, but a heck of a lot more readable. (side note: answering a question is a pain, needs work)
t3.table <- diamonds %>% group_by(clarity, cut) %>% summarise(nrow=n())