
R: How can I do a conditional count in dplyr? [duplicate]

Tags: r, dplyr

I have this data frame. I'd like to aggregate the data so that one column shows total launches and the next shows total failed launches.

      state_name launch_year category
1  United States        1958  Success
2  United States        1958  Success
3  United States        1958  Success
4  United States        1958  Failure
5  United States        1958  Failure
6  United States        1958  Failure
7   Soviet Union        1957  Success
8   Soviet Union        1957  Success
9   Soviet Union        1958  Success
10  Soviet Union        1959  Success
11  Soviet Union        1959  Success
12  Soviet Union        1959  Success
13  Soviet Union        1958  Failure
14  Soviet Union        1958  Failure
15  Soviet Union        1958  Failure
16  Soviet Union        1958  Failure
17  Soviet Union        1959  Failure
18 United States        1959  Success
19 United States        1959  Failure
20 United States        1958  Success
21 United States        1959  Success
22 United States        1959  Failure
23 United States        1958  Success
24 United States        1958  Success
25 United States        1959  Success
26 United States        1959  Success
27 United States        1959  Success
28 United States        1959  Success
29 United States        1959  Success
30 United States        1959  Success
31 United States        1959  Success
32 United States        1958  Failure
33 United States        1958  Failure
34 United States        1959  Failure
35 United States        1959  Failure
36 United States        1959  Failure
37 United States        1958  Success
38 United States        1959  Success
39 United States        1959  Success
40 United States        1957  Failure
41 United States        1958  Failure
42 United States        1958  Failure
43 United States        1958  Failure
44 United States        1958  Failure
45 United States        1958  Failure
46 United States        1958  Failure
47 United States        1958  Failure
48 United States        1958  Failure
49 United States        1958  Failure
50 United States        1958  Failure
51 United States        1959  Failure
52 United States        1959  Failure

Each row represents a launch. The category is the outcome of the launch.

I'd like to turn it into something like this.

     state_name launch_year launches failed_launches
1 United States        1957        1               1
2  Soviet Union        1957        2               0
3 United States        1958       22              15
4  Soviet Union        1958        5               4
5 United States        1959       18               7
6  Soviet Union        1959        4               1

I've tried filtering to just the failed launches and then adding a failed_launches column, but I don't know how to get back to the rest of the data (the total launches per state and year) from there.

launches %>% 
  filter(category == "Failure") %>%
  count(state_name, launch_year) %>%
  mutate(failed_launches = n)
Sebastian asked Feb 09 '19 23:02


1 Answer

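You can get both counts in a single summarise() call, without filtering first. n() counts every row in the group, and summing the logical vector category == "Failure" counts the rows where it is TRUE:

launches %>%
  group_by(state_name, launch_year) %>%
  summarise(
    launches = n(),                               # all launches in the group
    failed_launches = sum(category == "Failure")  # each TRUE counts as 1
  )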
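For anyone who wants to try this without the full data, here is a minimal self-contained sketch. The six-row sample is made up for illustration and is not the question's data, and the failure_rate column is an extra that was not asked for; it is only there to show that mean() over the same logical vector gives the share of failures.

```r
library(dplyr)

# Hypothetical six-row sample in the same shape as the question's data
launches <- data.frame(
  state_name  = c("United States", "United States", "United States",
                  "Soviet Union", "Soviet Union", "Soviet Union"),
  launch_year = c(1958, 1958, 1959, 1958, 1958, 1958),
  category    = c("Success", "Failure", "Failure",
                  "Failure", "Failure", "Success"),
  stringsAsFactors = FALSE
)

result <- launches %>%
  group_by(state_name, launch_year) %>%
  summarise(
    launches        = n(),                          # rows in the group
    failed_launches = sum(category == "Failure"),   # TRUE is counted as 1
    failure_rate    = mean(category == "Failure")   # illustrative extra column
  ) %>%
  ungroup()                                         # drop the leftover grouping

print(result)
```

If category can contain NA, sum() and mean() will return NA for that group unless you pass na.rm = TRUE. Whether dropping those rows is the right call depends on what a missing outcome means in your data.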
arg0naut91 answered Nov 03 '22 00:11