I have a list of 3 dataframes that share some rows and columns
DATA
all_data <- list(questionnaireA = structure(list(name = structure(2:1, .Label = c("James",
"Shawn"), class = "factor"), banana = c(1, 0), grapes = c(1,
1), orange = c("AB", 1)), class = "data.frame", row.names = c(NA,
-2L)), questionnaireB = structure(list(name = structure(2:1, .Label = c("Chris",
"James"), class = "factor"), orange = c(1, 0), banana = c(1,
0)), class = "data.frame", row.names = c(NA, -2L)), questionnaireC = structure(list(
name = structure(3:1, .Label = c("Donald", "James", "Shawn"
), class = "factor"), banana = c(1, 0, 0), raisins = c(1,
1, 1), grapes = c(1, 1, 0), cake = c(0, 1, 0)), class = "data.frame", row.names = c(NA, -3L)))
$questionnaireA name banana grapes orange 1 Shawn 1 1 AB 2 James 0 1 1 $questionnaireB name orange banana 1 James 1 1 2 Chris 0 0 $questionnaireC name banana raisins grapes cake 1 Shawn 1 1 1 0 2 James 0 1 1 1 3 Donald 0 1 0 0
library(tidyverse)
map(all_data, ~ .x %>%
pivot_longer(cols=-name, names_to="fruit"))
Any help would be appreciated!
If we follow the similar option as in OP's attempted method, i.e. reshape to 'long' format each of the datasets in the list, then loop with imap, create the name of the list1 element as new column, use pivot_longer to reshape into 'long' format, then create a sequence column by group and reshape into 'wide' format with pivot_wider
library(dplyr)
library(tidyr)
library(purrr)
imap_dfr(all_data, ~
.x %>%
mutate(grp = .y) %>%
pivot_longer(cols = -c(name, grp),
names_to = "fruit", values_to = "Value")) %>%
#group_by(name, grp, fruit) %>%
#mutate(rn = row_number()) %>%
pivot_wider(names_from = grp, values_from = Value)
Or do this more efficiently, by binding all the datasets to a single data with bind_cols, do the pivot_longer while removing the missing values with value_drop_na = TRUE and then do the same as in the above solution
bind_rows(all_data, .id = 'grp') %>%
pivot_longer(cols = c(-name, -grp), names_to = "fruit",
values_to = "Value", values_drop_na = TRUE) %>%
# sequence column creation is not really required for the example
# as there are no duplicates
#group_by(name, grp, fruit) %>%
#mutate(rn = row_number()) %>%
pivot_wider(names_from = grp, values_from = Value)
Based on the new data with some mix in column types, if we need to have the values like "AB" remain as such, it needs to be converted to character class
imap_dfr(all_data, ~
.x %>%
mutate_at(-1, as.character) %>%
mutate(grp = .y) %>%
pivot_longer(cols = -c(name, grp), names_to = "fruit",
values_to = "Value")) %>%
pivot_wider(names_from = grp, values_from = Value)
Or an efficient approach similar to the bind_rows earlier (but it can't be done here because the column types are different)
map_dfr(all_data, ~
.x %>%
mutate_at(-1, as.character), .id = 'grp') %>%
pivot_longer(cols = c(-name, -grp), names_to = "fruit",
values_to = "Value", values_drop_na = TRUE) %>%
pivot_wider(names_from = grp, values_from = Value)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With