How can I convert my column "payment" from long to wide format while keeping the other columns unchanged?
For each level of "letter", when the cell is before the value of "payment", then when in the wide format this row of the corresponding new variable "e.g., dollar" will have "0"; otherwise "1".
I tried output_format_test<-input_format%>%tidyr::pivot_wider(names_from = age, values_from = payment)
, but it does not produce the intended result.
##Input format
input_format <- readr::read_table2("letter age payment
A 2 NA
A 3 dollar
A 4 NA
D 2 euro
D 3 dollar
D 4 NA
F 2 NA
F 3 euro
F 3 dollar
F 4 NA
F 4 NA")
input_format
# A tibble: 11 x 3
letter age payment
<chr> <dbl> <chr>
1 A 2 NA
2 A 3 dollar
3 A 4 NA
4 D 2 euro
5 D 3 dollar
6 D 4 NA
7 F 2 NA
8 F 3 euro
9 F 3 dollar
10 F 4 NA
11 F 4 NA
##output format
output_format <- readr::read_table2(
"letter age payment dollar euro
A 2 NA 0 0
A 3 dollar 1 0
A 4 NA 1 0
D 2 euro 0 1
D 3 dollar 1 1
D 4 NA 1 1
F 2 NA 0 0
F 3 euro 0 1
F 3 dollar 1 1
F 4 NA 1 1
F 4 NA 1 1
")
output_format
# A tibble: 11 x 5
letter age payment dollar euro
<chr> <dbl> <chr> <dbl> <dbl>
1 A 2 NA 0 0
2 A 3 dollar 1 0
3 A 4 NA 1 0
4 D 2 euro 0 1
5 D 3 dollar 1 1
6 D 4 NA 1 1
7 F 2 NA 0 0
8 F 3 euro 0 1
9 F 3 dollar 1 1
10 F 4 NA 1 1
11 F 4 NA 1 1
Thanks. Edited.
You can also use the following tidyverse
solution:
library(dplyr)
library(tidyr)
library(stringr)
input_format %>%
mutate(id = row_number()) %>%
pivot_wider(names_from = payment, values_from = payment,
values_fn = length) %>%
select(- c(id, `NA`)) %>%
bind_cols(input_format$payment) %>%
rename_with(~ str_replace(., "\\.\\.\\.\\d+", "payment"), contains(fixed("..."))) %>%
relocate(letter, age, payment) %>%
group_by(letter) %>%
replace_na(list(dollar = 0, euro = 0)) %>%
mutate(across(dollar:euro, ~ cummax(.x))) -> input2
# A tibble: 11 x 5
# Groups: letter [3]
letter age payment dollar euro
<chr> <dbl> <chr> <dbl> <dbl>
1 A 2 NA 0 0
2 A 3 dollar 1 0
3 A 4 NA 1 0
4 D 2 euro 0 1
5 D 3 dollar 1 1
6 D 4 NA 1 1
7 F 2 NA 0 0
8 F 3 euro 0 1
9 F 3 dollar 1 1
10 F 4 NA 1 1
11 F 4 NA 1 1
After discussion in the comment you can use the following solution to get your desired output:
input2 %>%
group_by(letter, age) %>%
add_count() %>%
group_by(letter, age) %>%
filter((n == 2 & if_all(dollar:euro, ~ .x == 1)) | n == 1) %>%
select(-n) %>%
group_by(letter, age) %>%
add_count() %>%
group_split(letter, age) %>%
map_dfr(~ if(.x$n[1] == 2) {
.x %>% slice_tail(n = 1)
} else {
.x
})
# A tibble: 9 x 6
letter age payment dollar euro n
<chr> <dbl> <chr> <dbl> <dbl> <int>
1 A 2 NA 0 0 1
2 A 3 dollar 1 0 1
3 A 4 NA 1 0 1
4 D 2 euro 0 1 1
5 D 3 dollar 1 1 1
6 D 4 NA 1 1 1
7 F 2 NA 0 0 1
8 F 3 dollar 1 1 1
9 F 4 NA 1 1 2
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With