
Create all possible combinations of non-NA values for each group ID

Similar to this question but with an added twist:

Given the following data frame:

txt <- "ID    Col1    Col2    Col3    Col4
        1     6       10      NA      NA
        1     5       10      NA      NA
        1     NA      10      15      20
        2     17      25      NA      NA
        2     13      25      NA      NA
        2     NA      25      21      34
        2     NA      25      35      40"
DF <- read.table(text = txt, header = TRUE)

DF
  ID Col1 Col2 Col3 Col4
1  1    6   10   NA   NA
2  1    5   10   NA   NA
3  1   NA   10   15   20
4  2   17   25   NA   NA
5  2   13   25   NA   NA
6  2   NA   25   21   34
7  2   NA   25   35   40

I want to collapse the rows by group ID (Col2 is constant within each group in this example) and, when more than one combination of non-NA values exists within a group, return all combinations, like so:

  ID Col1 Col2 Col3 Col4
1  1    6   10   15   20
2  1    5   10   15   20
3  2   17   25   21   34
4  2   13   25   21   34
5  2   17   25   35   40
6  2   13   25   35   40

Importantly, I will eventually need this to work on non-numeric data. Any suggestions? Thanks!

Asked by Aaron, Dec 19 '25 16:12


1 Answer

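Group by 'ID', fill the other columns in both directions, ungroup to remove the grouping attribute, and keep the distinct rows. Note that on the sample data this returns only three rows for ID 2 (the 17/25/35/40 combination is missing), because fill() copies the nearest non-NA value rather than crossing the values.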

library(dplyr)
library(tidyr)
DF %>% 
    group_by(ID) %>% 
    fill(everything(), .direction = 'updown') %>%
    ungroup %>% 
    distinct(.keep_all = TRUE)

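Another option is to replace the NAs in each column with that column's recycled non-NA values. This pairs the values by position rather than crossing them, so it does not return every combination for ID 2 either: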

DF %>% 
   group_by(ID) %>% 
   mutate(across(everything(), ~ replace(., is.na(.), 
           rep(.[!is.na(.)], length.out = sum(is.na(.))))))

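Or, based on the comments, a version that returns all combinations for the sample data:

DF %>%
   group_by(ID) %>%
   # only modify columns that contain at least one NA
   mutate(across(where(~ any(is.na(.))), ~ {
        i1 <- is.na(.)   # NA positions within the group
        i2 <- !i1        # non-NA positions within the group
        # column starts with NA in this group: repeat each value in a block;
        # otherwise: cycle through the values, so the two kinds of column cross
        if(i1[1]) rep(.[i2], each = n()/sum(i2)) else 
               rep(.[i2], length.out = n())
     })) %>%
   ungroup() %>% 
   distinct(.keep_all = TRUE)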

Output of the last approach:

# A tibble: 6 x 5
     ID  Col1  Col2  Col3  Col4
  <int> <int> <int> <int> <int>
1     1     6    10    15    20
2     1     5    10    15    20
3     2    17    25    21    34
4     2    13    25    21    34
5     2    17    25    35    40
6     2    13    25    35    40
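
For data that may not follow the row layout of the example, one alternative is to treat the task as a join. The following is a minimal base R sketch, not taken from the answer above. It assumes that every ID has rows for every NA pattern and that the columns shared between patterns (here ID and Col2) agree within an ID. The names pattern, pieces and combos are made up for illustration. Because merge() works on any column type, it should also carry over to non-numeric data.

```r
DF <- read.table(text = "ID Col1 Col2 Col3 Col4
1 6 10 NA NA
1 5 10 NA NA
1 NA 10 15 20
2 17 25 NA NA
2 13 25 NA NA
2 NA 25 21 34
2 NA 25 35 40", header = TRUE)

# Label each row by which of its columns are observed (non-NA).
pattern <- apply(!is.na(DF), 1, paste, collapse = " ")

# One piece per NA pattern, keeping only the columns observed in that pattern.
pieces <- lapply(split(DF, pattern), function(d) {
  unique(d[, colSums(!is.na(d)) > 0, drop = FALSE])
})

# merge() joins on the columns the pieces share (assumed here: ID and Col2),
# which pairs every row of one piece with every matching row of the other.
combos <- Reduce(merge, pieces)

# Restore the original column order and sort by ID.
combos <- combos[order(combos$ID), names(DF)]
rownames(combos) <- NULL
combos
```

On the sample data this should give the same six rows as the output above, possibly in a different order within each ID. Since merge() performs an inner join by default, an ID that lacks one of the patterns would be dropped, which is why the assumption above matters.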

Answered by akrun, Dec 21 '25 08:12


