Creating column that lists distinct observations

Question

I have a data frame of observations that looks like this (showing the course numbers of college classes offered each term). The columns are very long and of varying lengths.

  spring   summer   fall
   4a       5b       5c
   4a       9c       11b
   7c       5b       8a 
   ...      ...      ...

I want to reformat it to look like this. First, I want to create a column, "Course_Names", that lists every distinct course offered. Then, I want to count the number of sections of each course offered each semester.

   Course_Names   spring   summer   fall
   4a             2        0        0
   5b             0        2        0
   5c             0        0        1
   7c             1        0        0
   8a             1        0        1
   9c             0        1        0
   11b            0        0        1

Any advice or links to relevant posts would be very much appreciated! Thank you!

akrun · Accepted Answer

In base R, one option is to stack the data.frame into a two-column dataset and use table:

table(stack(df1))
#    ind
#values spring summer fall
#   11b      0      0    1
#   4a       2      0    0
#   5b       0      2    0
#   5c       0      0    1
#   7c       1      0    0
#   8a       0      0    1
#   9c       0      1    0
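The question mentions columns of varying lengths and a "Course_Names" column, so here is a minimal sketch of how the same base R idea might cover both. It assumes the shorter columns are padded with NA, which table drops by default. The data frame df2 and the names long, counts, wide and result are hypothetical and not part of the original post.

```r
# Hypothetical data: summer and fall are shorter, so they are padded with NA
df2 <- data.frame(
  spring = c("4a", "4a", "7c"),
  summer = c("5b", "9c", NA),
  fall   = c("5c", NA, NA),
  stringsAsFactors = FALSE
)

# Stack into two columns: 'values' (course) and 'ind' (term)
long <- stack(df2)

# Cross-tabulate course by term; NA courses are excluded by default
counts <- table(long$values, long$ind)

# Turn the table into a plain data.frame with an explicit Course_Names column
wide <- as.data.frame(unclass(counts))
result <- data.frame(Course_Names = rownames(wide), wide,
                     row.names = NULL, stringsAsFactors = FALSE)
result
```

The term columns keep their original order (spring, summer, fall) because stack builds the 'ind' factor from the column names in the order they appear.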

Or, in the tidyverse, we can reshape into 'long' format with pivot_longer, get the counts, and reshape back into 'wide' format:

library(dplyr)
library(tidyr)
df1 %>%
    pivot_longer(everything()) %>%
    count(name, Course_Names = value) %>%
    pivot_wider(names_from = name, values_from = n, values_fill = list(n = 0))
# A tibble: 7 x 4
#  Course_Names  fall spring summer
#  <chr>        <int>  <int>  <int>
#1 11b              1      0      0
#2 5c               1      0      0
#3 8a               1      0      0
#4 4a               0      2      0
#5 7c               0      1      0
#6 5b               0      0      2
#7 9c               0      0      1
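In the output above the term columns come back alphabetically (fall, spring, summer). A possible variation, sketched here under the assumption that shorter columns are padded with NA and that every term has at least one course, drops the NA values and restores the original column order. The data frame df2 and the column name term are hypothetical and not part of the original post.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: summer and fall are shorter, so they are padded with NA
df2 <- data.frame(
  spring = c("4a", "4a", "7c"),
  summer = c("5b", "9c", NA),
  fall   = c("5c", NA, NA),
  stringsAsFactors = FALSE
)

result <- df2 %>%
    # One row per (term, course) pair; padding NAs are dropped
    pivot_longer(everything(), names_to = "term", values_to = "Course_Names",
                 values_drop_na = TRUE) %>%
    # Number of sections of each course in each term
    count(Course_Names, term) %>%
    # Back to one column per term, filling missing combinations with 0
    pivot_wider(names_from = term, values_from = n, values_fill = list(n = 0)) %>%
    # Put the term columns back in their original order
    select(Course_Names, all_of(names(df2)))
result
```

If some term had no courses at all, all_of(names(df2)) would raise an error for the missing column; any_of could be used instead in that case.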

data

df1 <- structure(list(spring = c("4a", "4a", "7c"), summer = c("5b", 
"9c", "5b"), fall = c("5c", "11b", "8a")), class = "data.frame", row.names = c(NA, 
-3L))
