 

percentage count by group using dplyr

Tags: r, dplyr

With a data frame df like the one below:

df <- data.frame(colors = c("red", "blue", "green", "red", "red" , "blue"))

I can find the count per color using dplyr as follows:

library(dplyr)

df %>%
  group_by(colors) %>%
  summarise(count = n())

Instead of the count, I need the percentage of rows for each color. How can I do that using dplyr?

user3206440 asked Dec 03 '22 13:12



1 Answer

You can either pipe the counts to mutate(prop = count / sum(count)), or compute the proportion directly within summarise() using nrow(.). Something like this:

df %>%
  group_by(colors) %>%
  # nrow(.) is the total number of rows in the data frame piped in by %>%
  summarise(prop = n() / nrow(.))

or

df %>%
  group_by(colors) %>%
  summarise(count = n()) %>%
  mutate(prop = count / sum(count))
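
Both versions above return a proportion between 0 and 1. If an actual percentage is wanted, a minimal sketch (assuming the same df as in the question; the column name percent is only an illustrative choice) is to multiply by 100:

```r
library(dplyr)

df <- data.frame(colors = c("red", "blue", "green", "red", "red", "blue"))

# Count rows per color, then express each count as a percentage of all rows.
# summarise() drops the single grouping level, so sum(count) is the grand total.
result <- df %>%
  group_by(colors) %>%
  summarise(count = n()) %>%
  mutate(percent = 100 * count / sum(count))

print(result)
```

With this data, three of the six rows are red, so red comes out at 50 percent and the percent column sums to 100.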
Romain Francois answered Dec 17 '22 06:12
