I want the count and the proportion (out of all elements) of each group in a data frame, after filtering. This code produces the desired output:
library(dplyr)
# tibble() replaces the deprecated data_frame()
df <- tibble(id = sample(letters[1:3], 100, replace = TRUE),
             value = rnorm(100))
summary <- filter(df, value > 0) %>%
  group_by(id) %>%
  summarize(count = n()) %>%
  ungroup() %>%
  mutate(proportion = count / sum(count))
> summary
# A tibble: 3 x 3
     id count proportion
  <chr> <int>      <dbl>
1     a    17  0.3695652
2     b    13  0.2826087
3     c    16  0.3478261
Is there an elegant way to avoid the ungroup() and mutate() steps? Something like:
summary <- filter(df, value > 0) %>%
  group_by(id) %>%
  summarize(count = n(),
            proportion = n() / [?TOTAL_ROWS()?])
I couldn't find such a function in the documentation, but I must be missing something obvious. Thanks!
You can call nrow() on ., which refers to the entire data frame piped in:
df %>%
  filter(value > 0) %>%
  group_by(id) %>%
  summarise(count = n(), proportion = count / nrow(.))
# A tibble: 3 x 3
#      id count proportion
#   <chr> <int>      <dbl>
# 1     a    14  0.2592593
# 2     b    22  0.4074074
# 3     c    18  0.3333333
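Note that the . placeholder is specific to the magrittr %>% pipe. If you would rather not depend on it, one possible alternative is dplyr's count(), which returns an ungrouped result when the input is ungrouped, so the proportion can be computed with a plain mutate(). A minimal sketch, using a small made-up data frame (not the random data from the question) so the numbers can be checked by hand:

```r
library(dplyr)

# Hypothetical toy data: after filtering value > 0, a has 2 rows, b has 1, c has 2
df <- tibble(id    = c("a", "a", "b", "c", "c", "c"),
             value = c(1, 2, 3, -1, 4, 5))

result <- df %>%
  filter(value > 0) %>%
  count(id, name = "count") %>%            # one row per id with its row count
  mutate(proportion = count / sum(count))  # share of all remaining rows

result
# proportion is 0.4, 0.2, 0.4 for a, b, c
```

This still takes two steps, but it needs neither group_by() nor ungroup().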