I have a "wide" data.frame with a series of paired columns. My goal is to unpivot this using pivot_longer. There is lot's of help for paired columns with prefixes in all columns. But in my case only half of the pairs has prefixes and I don't know how to define the "names_pattern", if this is even possible.
In the following example all paired columns have a prefix which I can split with the names_pattern attribute of pivot_longer:
df_wide <- tribble(
~id , ~f_start , ~d_start , ~f_end , ~d_end ,
'A' , 'p' , '2018-01-01' , 'p' , '2018-02-01' ,
'B' , 'i' , '2019-04-01' , 'p' , '2020-01-01' ,
'C' , 'i' , '2018-06-01' , 'i' , '2019-03-01' ,
'D' , 'p' , '2019-12-01' , 'p' , '2020-05-01' ,
'E' , 'p' , '2019-02-01' , 'p' , '2019-05-01' ,
'F' , 'i' , '2018-04-01' , 'i' , '2018-07-01' ,
)
df_wide %>%
pivot_longer(-id ,
names_to = c('.value' , 'status') ,
names_pattern = '(.*)_(.*)'
)
This leads to my expected output:
id status f d
<chr> <chr> <chr> <chr>
1 A start p 2018-01-01
2 A end p 2018-02-01
3 B start i 2019-04-01
4 B end p 2020-01-01
5 C start i 2018-06-01
6 C end i 2019-03-01
7 D start p 2019-12-01
8 D end p 2020-05-01
9 E start p 2019-02-01
10 E end p 2019-05-01
11 F start i 2018-04-01
12 F end i 2018-07-01
Unfortunately my data.frame looks like this with missing prefix in one half of the pairs.
df_wide<- tribble(
~id , ~f_start , ~start , ~f_end , ~end ,
'A' , 'p' , '2018-01-01' , 'p' , '2018-02-01' ,
'B' , 'i' , '2019-04-01' , 'p' , '2020-01-01' ,
'C' , 'i' , '2018-06-01' , 'i' , '2019-03-01' ,
'D' , 'p' , '2019-12-01' , 'p' , '2020-05-01' ,
'E' , 'p' , '2019-02-01' , 'p' , '2019-05-01' ,
'F' , 'i' , '2018-04-01' , 'i' , '2018-07-01' ,
)
Has anyone an idea how to achieve the expected output?
Thanks in advance
If we can rename
the missing prefixes with rename_at
, the OP's solution would work
library(dplyr)
library(tidyr)
library(stringr)
df_wide %>%
rename_at(vars(matches('^(start|end)')), ~ str_c('d_', .)) %>%
pivot_longer(-id ,
names_to = c('.value' , 'status') ,
names_pattern = '(.*)_(.*)'
)
# A tibble: 12 x 4
# id status f d
# <chr> <chr> <chr> <chr>
# 1 A start p 2018-01-01
# 2 A end p 2018-02-01
# 3 B start i 2019-04-01
# 4 B end p 2020-01-01
# 5 C start i 2018-06-01
# 6 C end i 2019-03-01
# 7 D start p 2019-12-01
# 8 D end p 2020-05-01
# 9 E start p 2019-02-01
#10 E end p 2019-05-01
#11 F start i 2018-04-01
#12 F end i 2018-07-01
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With