I have a data frame that looks like this:
ID pick1 pick2 pick3
1 NA 21/11/29 21/11/30
2 21/11/28 21/11/29 NA
3 21/11/28 NA 21/11/30
4 NA 21/11/29 21/11/30
Each participant (ID) could pick 2 dates out of 3 options. Now I want to summarize the selected dates to get a tibble like this:
ID date1 date2
1 21/11/29 21/11/30
2 21/11/28 21/11/29
3 21/11/28 21/11/30
4 21/11/29 21/11/30
However, I can't get it working using tidyverse functions only. I have started to use this library but couldn't find a solution for my issue online
When working with categorical variables, you may use the group_by() method to divide the data into subgroups based on the variable's distinct categories. You can group by a single variable or by giving in multiple variable names to group by several variables.
One option is with rowwise
- group by rows, do the sort
with na.last
as TRUE, keep the sorted output in a list
, unnest
to multiple columns, and select
only columns having at least one non-NA elements
library(dplyr)
library(tidyr)
library(stringr)
df1 %>%
rowwise %>%
transmute(ID, date = list(sort(c_across(starts_with('pick')),
na.last = TRUE))) %>%
ungroup %>%
unnest_wider(date) %>%
rename_with(~ str_c('date', seq_along(.)), -ID) %>%
select(where(~ any(!is.na(.))))
-output
# A tibble: 4 × 3
ID date1 date2
<int> <chr> <chr>
1 1 21/11/29 21/11/30
2 2 21/11/28 21/11/29
3 3 21/11/28 21/11/30
4 4 21/11/29 21/11/30
or reshape to 'long' format with pivot_longer
remove the NA
s and reshape it back to 'wide' format
library(stringr)
df1 %>%
pivot_longer(cols = -ID, values_drop_na = TRUE) %>%
group_by(ID) %>%
mutate(name = str_c('date', row_number())) %>%
ungroup %>%
pivot_wider(names_from = name, values_from = value)
-output
# A tibble: 4 × 3
ID date1 date2
<int> <chr> <chr>
1 1 21/11/29 21/11/30
2 2 21/11/28 21/11/29
3 3 21/11/28 21/11/30
4 4 21/11/29 21/11/30
df1 <- structure(list(ID = 1:4, pick1 = c(NA, "21/11/28", "21/11/28",
NA), pick2 = c("21/11/29", "21/11/29", NA, "21/11/29"), pick3 = c("21/11/30",
NA, "21/11/30", "21/11/30")), class = "data.frame",
row.names = c(NA,
-4L))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With