
dplyr count number of one specific value of variable

Tags: r, count, dplyr

Say I have a dataset like this:

id <- c(1, 1, 2, 2, 3, 3)
code <- c("a", "b", "a", "a", "b", "b")
dat <- data.frame(id, code)

I.e.,

    id  code
1   1   a
2   1   b 
3   2   a
4   2   a
5   3   b
6   3   b

Using dplyr, how would I get a count of how many a's there are for each id?

i.e.,

   id  countA
1   1   1
2   2   2
3   3   0

I'm trying stuff like this, which isn't working:

countA<- dat %>%
group_by(id) %>%
summarise(cip.completed= count(code == "a"))

The above gives me an error, "Error: no applicable method for 'group_by_' applied to an object of class "logical""

Thanks for your help!

Jacob Curtis asked Mar 30 '16 16:03



1 Answer

Try the following instead:

library(dplyr)
dat %>% group_by(id) %>%
  summarise(cip.completed= sum(code == "a"))

Source: local data frame [3 x 2]
    id cip.completed
  (dbl)         (int)
1     1             1
2     2             2
3     3             0

This works because the logical condition code == "a" yields a vector of TRUE and FALSE values, which sum() treats as ones and zeros, so the sum of this vector is the number of occurrences.
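To see the coercion on its own, here is a minimal base R sketch using the vectors from the question. The names is_a and per_id are only illustrative, and tapply stands in for the dplyr grouping so the snippet runs without any packages.

```r
# Data from the question
id   <- c(1, 1, 2, 2, 3, 3)
code <- c("a", "b", "a", "a", "b", "b")

# The comparison returns a logical vector, one element per row
is_a <- code == "a"
print(is_a)

# sum() counts TRUE as 1 and FALSE as 0, so this is the total number of a's
total_a <- sum(is_a)
print(total_a)

# Base R equivalent of the grouped sum: apply sum() to is_a within each id
per_id <- tapply(is_a, id, sum)
print(per_id)
```

The per-id result includes id 3 with a count of 0, because every row takes part in the sum whether or not it matches.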

Note that you would not normally use dplyr::count inside summarise anyway, as it is itself a wrapper around summarise that calls either n() or sum(). It also expects a data frame as its first argument, which is why passing it the logical vector code == "a" produced the error above. See ?dplyr::count. If you really want to use count, you could first filter the dataset to retain only the rows in which code == "a"; count then returns only the strictly positive (i.e. non-zero) counts, so id 3 is dropped from the result. For instance,

dat %>% filter(code == "a") %>% count(id)

Source: local data frame [2 x 2]

     id     n
  (dbl) (int)
1     1     1
2     2     2
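If you want count and still need the zero for id 3, one possible sketch is to skip the filter and pass the condition as the wt argument, which makes count sum that expression per group instead of counting rows. This assumes a dplyr version in which count has the name argument (0.8 or later, as far as I know); the column name countA is taken from the question and res is just an illustrative variable name.

```r
library(dplyr)

# Data from the question
dat <- data.frame(
  id   = c(1, 1, 2, 2, 3, 3),
  code = c("a", "b", "a", "a", "b", "b")
)

# wt = code == "a" makes count() sum the logical condition within each id,
# so groups with no matching rows are kept with a count of 0
res <- dat %>% count(id, wt = code == "a", name = "countA")
print(res)
```

This is equivalent in effect to the group_by/summarise version above; which one reads better is a matter of taste.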
coffeinjunky answered Sep 26 '22 06:09