Tidyverse: Reduce variables by group

Tags: r, tidyverse

I have a data frame that looks like this:

ID  pick1      pick2     pick3
1   NA         21/11/29  21/11/30
2   21/11/28   21/11/29  NA
3   21/11/28   NA        21/11/30   
4   NA         21/11/29  21/11/30

Each participant (ID) could pick 2 dates out of 3 options. Now I want to summarize the selected dates to get a tibble like this:

ID  date1      date2
1   21/11/29   21/11/30
2   21/11/28   21/11/29
3   21/11/28   21/11/30   
4   21/11/29   21/11/30

However, I can't get it working using tidyverse functions only. I have only just started using the tidyverse and couldn't find a solution for my issue online.

asked Nov 29 '21 17:11

diggi2395

1 Answer

One option is rowwise: work row by row, sort each row's picks with na.last = TRUE so that the NAs go to the end, keep the sorted vector in a list column, spread it into separate columns with unnest_wider, and select only the columns that have at least one non-NA element.

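library(dplyr)
library(tidyr)
library(stringr)
df1 %>% 
   rowwise() %>% 
   # sort each row's picks; na.last = TRUE keeps the NAs and puts them last
   transmute(ID, date = list(sort(c_across(starts_with('pick')), 
       na.last = TRUE))) %>% 
   ungroup() %>%
   # one column per list element; names_sep lets tidyr name the unnamed
   # elements (recent tidyr versions may refuse to unnest without it)
   unnest_wider(date, names_sep = '') %>%
   rename_with(~ str_c('date', seq_along(.)), -ID) %>%
   # drop the column that holds only NAs
   select(where(~ any(!is.na(.))))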

-output

# A tibble: 4 × 3
     ID date1    date2   
  <int> <chr>    <chr>   
1     1 21/11/29 21/11/30
2     2 21/11/28 21/11/29
3     3 21/11/28 21/11/30
4     4 21/11/29 21/11/30
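
To see why the final select step is needed, here is a minimal sketch of how sort treats NA. The vector name picks and its values are only illustrative, not part of the answer. The values are character strings, so the order is alphabetical; it matches chronological order here only because the strings are written as yy/mm/dd.

```r
# A hypothetical row of picks, deliberately out of order
picks <- c("21/11/30", NA, "21/11/29")

# Default (na.last = NA): the NA is removed from the result
sort(picks)

# na.last = TRUE: the NA is kept and placed after the sorted values
sort(picks, na.last = TRUE)
```

Because na.last = TRUE keeps every row's vector at length 3, the third unnested column contains only NA, and select(where(~ any(!is.na(.)))) is what removes it.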

Alternatively, reshape to 'long' format with pivot_longer, dropping the NAs along the way, and then reshape back to 'wide' format with pivot_wider.

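library(dplyr)
library(tidyr)
library(stringr)
df1 %>% 
   # long format: one row per non-NA pick
   pivot_longer(cols = -ID, values_drop_na = TRUE) %>%
   group_by(ID) %>% 
   # number the remaining picks within each ID: date1, date2
   mutate(name = str_c('date', row_number())) %>%
   ungroup() %>% 
   pivot_wider(names_from = name, values_from = value)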

-output

# A tibble: 4 × 3
     ID date1    date2   
  <int> <chr>    <chr>   
1     1 21/11/29 21/11/30
2     2 21/11/28 21/11/29
3     3 21/11/28 21/11/30
4     4 21/11/29 21/11/30
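
The intermediate long table may make this second approach easier to follow. The sketch below rebuilds the sample data with data.frame (an equivalent of the structure call in the data section) and stops after the first step; the name long is an illustrative choice.

```r
library(tidyr)

# Same sample data as in the answer, rebuilt with data.frame()
df1 <- data.frame(
  ID = 1:4,
  pick1 = c(NA, "21/11/28", "21/11/28", NA),
  pick2 = c("21/11/29", "21/11/29", NA, "21/11/29"),
  pick3 = c("21/11/30", NA, "21/11/30", "21/11/30")
)

# One row per (ID, pick column) pair; rows whose value is NA are dropped
long <- pivot_longer(df1, cols = -ID, values_drop_na = TRUE)
long
```

Each ID keeps two rows, in the original column order (pick1, pick2, pick3). Unlike the rowwise version, nothing is sorted here, so date1 and date2 come out in chronological order only if the pick columns already are, as they are in the sample data.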

data

df1 <- structure(list(ID = 1:4,
    pick1 = c(NA, "21/11/28", "21/11/28", NA),
    pick2 = c("21/11/29", "21/11/29", NA, "21/11/29"),
    pick3 = c("21/11/30", NA, "21/11/30", "21/11/30")),
    class = "data.frame", row.names = c(NA, -4L))

answered Oct 19 '22 22:10

akrun
