I have this data frame. I'd like to aggregate the data so that one column shows total launches and the next shows total failed launches.
state_name launch_year category
1 United States 1958 Success
2 United States 1958 Success
3 United States 1958 Success
4 United States 1958 Failure
5 United States 1958 Failure
6 United States 1958 Failure
7 Soviet Union 1957 Success
8 Soviet Union 1957 Success
9 Soviet Union 1958 Success
10 Soviet Union 1959 Success
11 Soviet Union 1959 Success
12 Soviet Union 1959 Success
13 Soviet Union 1958 Failure
14 Soviet Union 1958 Failure
15 Soviet Union 1958 Failure
16 Soviet Union 1958 Failure
17 Soviet Union 1959 Failure
18 United States 1959 Success
19 United States 1959 Failure
20 United States 1958 Success
21 United States 1959 Success
22 United States 1959 Failure
23 United States 1958 Success
24 United States 1958 Success
25 United States 1959 Success
26 United States 1959 Success
27 United States 1959 Success
28 United States 1959 Success
29 United States 1959 Success
30 United States 1959 Success
31 United States 1959 Success
32 United States 1958 Failure
33 United States 1958 Failure
34 United States 1959 Failure
35 United States 1959 Failure
36 United States 1959 Failure
37 United States 1958 Success
38 United States 1959 Success
39 United States 1959 Success
40 United States 1957 Failure
41 United States 1958 Failure
42 United States 1958 Failure
43 United States 1958 Failure
44 United States 1958 Failure
45 United States 1958 Failure
46 United States 1958 Failure
47 United States 1958 Failure
48 United States 1958 Failure
49 United States 1958 Failure
50 United States 1958 Failure
51 United States 1959 Failure
52 United States 1959 Failure
Each row represents a launch. The category is the outcome of the launch.
I'd like to turn it into something like this.
state_name launch_year launches failed_launches
1 United States 1957 1 1
2 Soviet Union 1957 2 0
3 United States 1958 22 15
4 Soviet Union 1958 5 4
5 United States 1959 18 7
6 Soviet Union 1959 4 1
I've tried filtering to just the failed launches and then adding a failed_launches
column, but I don't know how to get back to the rest of the data from there:
launches %>%
  filter(category == "Failure") %>%
  count(state_name, launch_year) %>%
  mutate(failed_launches = n)
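You could group by state and year, then count all rows and the failed rows in a single summarise() call. sum(category == "Failure") works because each TRUE counts as 1:

library(dplyr)

launches %>%
  group_by(state_name, launch_year) %>%
  summarise(
    launches = n(),                                # every row is one launch
    failed_launches = sum(category == "Failure"),  # TRUE counts as 1
    .groups = "drop"                               # ungrouped result (dplyr >= 1.0.0)
  )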
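If you would rather finish the filter-and-count attempt from the question, the usual way to "get back to the rest of the data" is to join the failure counts onto the totals. Below is a minimal sketch of that idea. The eight-row `launches` tibble is a made-up sample in the same shape as the question's data, and `totals`, `failures` and `result` are names chosen here for illustration.

```r
library(dplyr)

# Hypothetical sample in the same shape as the question's data
launches <- tibble(
  state_name  = c("United States", "United States", "United States",
                  "Soviet Union", "Soviet Union", "Soviet Union",
                  "Soviet Union", "United States"),
  launch_year = c(1958, 1958, 1958, 1957, 1957, 1958, 1958, 1957),
  category    = c("Success", "Failure", "Failure", "Success", "Success",
                  "Success", "Failure", "Failure")
)

# Total launches per state and year
totals <- launches %>%
  count(state_name, launch_year, name = "launches")

# Failed launches per state and year (the filtering step from the question)
failures <- launches %>%
  filter(category == "Failure") %>%
  count(state_name, launch_year, name = "failed_launches")

# Join the failure counts back onto the totals. Groups without any
# failures have no matching row, so they come through as NA; set those to 0.
result <- totals %>%
  left_join(failures, by = c("state_name", "launch_year")) %>%
  mutate(failed_launches = coalesce(failed_launches, 0L)) %>%
  arrange(launch_year, state_name)

result
```

This should give the same counts as the group_by() and summarise() version. It takes more steps, though, so the single summarise() call is probably the easier one to maintain.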