tidyverse pivot_longer several sets of columns, but avoid intermediate mutate_wider steps [duplicate]

Question

I have the following data

dat <- data.frame(id         = c("A", "B", "C"),
                  Q1r1_pepsi = c(1,0,1),
                  Q1r1_cola  = c(0,0,1),
                  Q1r2_pepsi = c(1,1,1),
                  Q1r2_cola  = c(0,1,1),
                  stringsAsFactors = FALSE)

where Q1r1 and Q1r2 are rating questions in a survey and pepsi and cola are the brands being rated. So I have two ratings (r1 and r2) for two brands (pepsi, cola):

id      Q1r1_c1    Q1r1_c2    Q1r2_c1    Q1r2_c2
"A"     1          0          1          0
"B"     0          0          1          1
"C"     1          1          1          1

(Side question: how do I format a SO post so that it correctly contains the nicely formatted output that I would get when calling dat in the R Console?)

To analyze the data I need to reshape (pivot) the data so that rows indicate unique rating-brand pairs. Thus, the expected outcome would be:

id      brand   Q1r1    Q1r2
"A"     "pepsi" 1       1
"A"     "cola"  0       0
"B"     "pepsi" 0       1
"B"     "cola"  0       1
"C"     "pepsi" 1       1
"C"     "cola"  1       1

Currently, I always do a combination of pivot_longer and pivot_wider, but I was hoping that I can directly get this result by pivoting_longer without doing the intermediate step:

library(tidyverse)

dat_long <- dat %>%
  pivot_longer(cols = starts_with("Q1")) %>%
  separate(name, into = c("item", "brand"), remove = FALSE)

dat_wide <- dat_long %>%
  pivot_wider(id_cols = c(id, brand),
              names_from = item,
              values_from = value)

With this current example it's still ok to do this intermediate step, but it gets tiresome in other less clean examples, e.g. suppose my columns weren't named in a nice structure with Q1r1_c1, Q1r1_c2, Q1r2_c1, Q1r2_c2, but instead would be Q4, Q5, Q8r1, Q8r2 where the map would be between Q4 and Q8r1, and Q5/Q8r2, respectively.

Ronak Shah · Accepted Answer

You can use :

tidyr::pivot_longer(dat, cols = -id, 
                   names_to = c('.value', 'brand'), 
                   names_sep = "_")


#  id    brand  Q1r1  Q1r2
#  <chr> <chr> <dbl> <dbl>
#1 A     pepsi     1     1
#2 A     cola      0     0
#3 B     pepsi     0     1
#4 B     cola      0     1
#5 C     pepsi     1     1
#6 C     cola      1     1

deschen · Answer

Following @Ronak Shah's suggestion, I'm pasting my code here in case my column names are less structured. I'm building up this code on the dat data I specified in my initial post.

names(dat) <- c("id", "Q4", "Q5", "Q8r1", "Q8r2")

spec <- data.frame(.name  = names(dat)[-1],
                   .value = c("Q4Q5", "Q4Q5", "Q8", "Q8"),
                   brand  = rep(c("pepsi", "cola"), 2),
                   stringsAsFactors = FALSE)

dat_long <- pivot_longer_spec(dat, spec)

This gives essentially the same result as for the data with my structured names, just that the names are different now.

Again, I'm not entirely sure how generalizable this approach is, but it worked in my case.

tidyverse pivot_longer several sets of columns, but avoid intermediate mutate_wider steps [duplicate]

Tags:

r

pivot

tidyr

deschen

2 Answers

Ronak Shah

deschen

Recent Activity

Donate For Us

tidyverse pivot_longer several sets of columns, but avoid intermediate mutate_wider steps [duplicate]

Tags:

r

pivot

tidyr

deschen

2 Answers

Ronak Shah

deschen

Related questions

Recent Activity

Donate For Us