Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to swap the column and row entries in R

library(data.table)
dat1 <- data.table(id = c(1, 2, 34, 99),
           class = c("sports", "", "music, sports", ""),
           hobby = c("knitting, music, sports", "", "", "music"))
> dat1
  id         class                   hobby
1  1        sports knitting, music, sports
2  2                                      
3 34 music, sports                        
4 99                                 music

I have the above dataset, dat1, where each row corresponds to a unique id. For each id, multiple inputs for either class or hobby are separated by a comma.

I would like to exchange the row and column of this dataset so that I get the following:

     input class hobby
1   sports 1, 34     1
2 knitting           1
3    music    34 1, 99

In this dataset, each row corresponds to a unique input from dat1. Now the class and hobby columns are storing the corresponding ids from dat1, each separated by a comma.

Is there a quick way to swap the row and columns like this in R?

like image 470
Adrian Avatar asked Dec 14 '25 12:12

Adrian


2 Answers

Here is a data.table solution

Input

library(data.table)
dat1 <- data.table(id = c(1, 2, 34, 99),
                   class = c("sports", "", "music, sports", ""),
                   hobby = c("knitting, music, sports", "", "", "music"))
dat1
#>    id         class                   hobby
#> 1:  1        sports knitting, music, sports
#> 2:  2                                      
#> 3: 34 music, sports                        
#> 4: 99                                 music

Dataprep

# in long format
dt_melted <- melt.data.table(dat1, id.vars = "id", variable.name = "type", value.name = "value")
dt_melted
#>    id  type                   value
#> 1:  1 class                  sports
#> 2:  2 class                        
#> 3: 34 class           music, sports
#> 4: 99 class                        
#> 5:  1 hobby knitting, music, sports
#> 6:  2 hobby                        
#> 7: 34 hobby                        
#> 8: 99 hobby                   music

# split values by comma
dt_splitted <- dt_melted[, .(input = unlist(data.table::tstrsplit(value, ","))), by = .(id, type)]
dt_splitted
#>    id  type    input
#> 1:  1 class   sports
#> 2: 34 class    music
#> 3: 34 class   sports
#> 4:  1 hobby knitting
#> 5:  1 hobby    music
#> 6:  1 hobby   sports
#> 7: 99 hobby    music

Last Step 1

# bring back to desired wide format
dt_casted <- dcast.data.table(dt_splitted, 
                              formula = "input ~ type",
                              value.var = "id",
                              fun.aggregate = paste, 
                              collapse = ", ")
dt_casted
#>       input class hobby
#> 1: knitting           1
#> 2:    music    34 1, 99
#> 3:   sports 1, 34     1

Last Step 2 (more verbose)

# combine ids by class/hobby
dt_splitted[, .(class = paste(id[type == "class"], collapse = ", "),
                hobby = paste(id[type == "hobby"], collapse = ", ")),
            by = .(input = trimws(input))]
#>       input class hobby
#> 1:   sports 1, 34     1
#> 2:    music    34 1, 99
#> 3: knitting           1
like image 107
mnist Avatar answered Dec 16 '25 06:12

mnist


Another data.table option using dcast + melt

dcast(
  melt(dat1[, lapply(.SD, strsplit, ", "), id], "id")[
    ,
    .(input = unlist(value)),
    .(id, variable)
  ], input ~ variable,
  value.var = "id",
  fun = toString
)

which gives

      input class hobby
1: knitting           1
2:    music    34 1, 99
3:   sports 1, 34     1
like image 31
ThomasIsCoding Avatar answered Dec 16 '25 04:12

ThomasIsCoding



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!