I have a data frame of observations that looks like this (showing course numbers of college classes offered each term). The columns are very long and of varying lengths.
spring summer fall
4a 5b 5c
4a 9c 11b
7c 5b 8a
... ... ...
I want to reformat it to look like this. First, I want to create a column, "Course_Names", that lists every distinct course offered. Then, I want to count the number of sections of each course offered each semester.
Course_Names spring summer fall
4a 2 0 0
5b 0 2 0
5c 0 0 1
7c 1 0 0
8a 1 0 1
9c 0 1 0
11b 0 0 1
Any advice or links to relevant posts would be very much appreciated! Thank you!
In base R, one option is to stack the data.frame into a two-column dataset and use table:
table(stack(df1))
# ind
#values spring summer fall
# 11b 0 0 1
# 4a 2 0 0
# 5b 0 2 0
# 5c 0 0 1
# 7c 1 0 0
# 8a 0 0 1
# 9c 0 1 0
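If you need the result as a data.frame with an actual "Course_Names" column (as in the question) rather than a table object, something along these lines might work. This is only a sketch: the sort step assumes every course name is digits followed by letters (like "4a" or "11b"), which may not hold for your real data.

```r
# Example data from the question (stringsAsFactors = FALSE so stack() sees character vectors)
df1 <- data.frame(spring = c("4a", "4a", "7c"),
                  summer = c("5b", "9c", "5b"),
                  fall   = c("5c", "11b", "8a"),
                  stringsAsFactors = FALSE)

# Cross-tabulate course by term, then convert the table to a plain data.frame
tab <- table(stack(df1))
out <- as.data.frame.matrix(tab)

# Move the row names (course names) into their own column
out <- cbind(Course_Names = rownames(out), out)

# Assumption: names look like "<digits><letters>"; sort by the numeric part, then by name
num_part <- as.numeric(sub("\\D+$", "", out$Course_Names))
out <- out[order(num_part, out$Course_Names), ]
rownames(out) <- NULL
out
```

Without the sort step the rows come out in alphabetical order, so "11b" lands before "4a".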
Or in tidyverse, we can reshape into 'long' format with pivot_longer, get the counts, and reshape back into 'wide' format:
library(dplyr)
library(tidyr)
df1 %>%
  pivot_longer(everything()) %>%
  count(name, Course_Names = value) %>%
  pivot_wider(names_from = name, values_from = n, values_fill = list(n = 0))
# A tibble: 7 x 4
# Course_Names fall spring summer
# <chr> <int> <int> <int>
#1 11b 1 0 0
#2 5c 1 0 0
#3 8a 1 0 0
#4 4a 0 2 0
#5 7c 0 1 0
#6 5b 0 0 2
#7 9c 0 0 1
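Since the question mentions columns of varying lengths, the shorter columns are presumably padded with NA. A possible variation that drops those NAs and keeps the term columns in their original order is sketched below. The data frame df2 is a made-up example, and values_fill = 0 assumes a reasonably recent tidyr (1.1.0 or later).

```r
library(dplyr)
library(tidyr)

# Hypothetical data: columns of unequal length, padded with NA
df2 <- data.frame(spring = c("4a", "4a", "7c"),
                  summer = c("5b", "9c", NA),
                  fall   = c("5c", NA, NA),
                  stringsAsFactors = FALSE)

res <- df2 %>%
  # values_drop_na = TRUE removes the NA padding before counting
  pivot_longer(everything(), values_drop_na = TRUE) %>%
  count(Course_Names = value, name) %>%
  pivot_wider(names_from = name, values_from = n, values_fill = 0) %>%
  # restore the original column order of the input
  select(Course_Names, all_of(names(df2)))
res
```

Counting by Course_Names first also means the rows come out sorted by course name rather than grouped by term.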
The example data used above:
df1 <- structure(list(spring = c("4a", "4a", "7c"),
                      summer = c("5b", "9c", "5b"),
                      fall = c("5c", "11b", "8a")),
                 class = "data.frame", row.names = c(NA, -3L))