library(data.table)
dat1 <- data.table(id = c(1, 2, 34, 99),
class = c("sports", "", "music, sports", ""),
hobby = c("knitting, music, sports", "", "", "music"))
> dat1
id class hobby
1 1 sports knitting, music, sports
2 2
3 34 music, sports
4 99 music
I have the above dataset, dat1, where each row corresponds to a unique id. For each id, multiple inputs for either class or hobby are separated by a comma.
I would like to exchange the row and column of this dataset so that I get the following:
input class hobby
1 sports 1, 34 1
2 knitting 1
3 music 34 1, 99
In this dataset, each row corresponds to a unique input from dat1. Now the class and hobby columns are storing the corresponding ids from dat1, each separated by a comma.
Is there a quick way to swap the row and columns like this in R?
Here is a data.table solution
library(data.table)
dat1 <- data.table(id = c(1, 2, 34, 99),
class = c("sports", "", "music, sports", ""),
hobby = c("knitting, music, sports", "", "", "music"))
dat1
#> id class hobby
#> 1: 1 sports knitting, music, sports
#> 2: 2
#> 3: 34 music, sports
#> 4: 99 music
# in long format
dt_melted <- melt.data.table(dat1, id.vars = "id", variable.name = "type", value.name = "value")
dt_melted
#> id type value
#> 1: 1 class sports
#> 2: 2 class
#> 3: 34 class music, sports
#> 4: 99 class
#> 5: 1 hobby knitting, music, sports
#> 6: 2 hobby
#> 7: 34 hobby
#> 8: 99 hobby music
# split values by comma
dt_splitted <- dt_melted[, .(input = unlist(data.table::tstrsplit(value, ","))), by = .(id, type)]
dt_splitted
#> id type input
#> 1: 1 class sports
#> 2: 34 class music
#> 3: 34 class sports
#> 4: 1 hobby knitting
#> 5: 1 hobby music
#> 6: 1 hobby sports
#> 7: 99 hobby music
# bring back to desired wide format
dt_casted <- dcast.data.table(dt_splitted,
formula = "input ~ type",
value.var = "id",
fun.aggregate = paste,
collapse = ", ")
dt_casted
#> input class hobby
#> 1: knitting 1
#> 2: music 34 1, 99
#> 3: sports 1, 34 1
# combine ids by class/hobby
dt_splitted[, .(class = paste(id[type == "class"], collapse = ", "),
hobby = paste(id[type == "hobby"], collapse = ", ")),
by = .(input = trimws(input))]
#> input class hobby
#> 1: sports 1, 34 1
#> 2: music 34 1, 99
#> 3: knitting 1
Another data.table option using dcast + melt
dcast(
melt(dat1[, lapply(.SD, strsplit, ", "), id], "id")[
,
.(input = unlist(value)),
.(id, variable)
], input ~ variable,
value.var = "id",
fun = toString
)
which gives
input class hobby
1: knitting 1
2: music 34 1, 99
3: sports 1, 34 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With