Similar to this question but with an added twist:
Given the following data frame:
txt <- "ID Col1 Col2 Col3 Col4
1 6 10 NA NA
1 5 10 NA NA
1 NA 10 15 20
2 17 25 NA NA
2 13 25 NA NA
2 NA 25 21 34
2 NA 25 35 40"
DF <- read.table(text = txt, header = TRUE)
DF
ID Col1 Col2 Col3 Col4
1 1 6 10 NA NA
2 1 5 10 NA NA
3 1 NA 10 15 20
4 2 17 25 NA NA
5 2 13 25 NA NA
6 2 NA 25 21 34
7 2 NA 25 35 40
I wish to collapse the rows by group ID (analogous to Col2 in this example), and when more than 1 combination is present per group, to return all combinations, as so:
ID Col1 Col2 Col3 Col4
1 1 6 10 15 20
2 1 5 10 15 20
3 2 17 25 21 34
4 2 13 25 21 34
5 2 17 25 35 40
6 2 13 25 35 40
Importantly, down the road I'll need this to work on non-numerical data. Any suggestions? Thanks!
Grouped by 'ID', fill other columns, ungroup to remove the group attribute and keep the distinct rows
library(dplyr)
library(tidyr)
DF %>%
group_by(ID) %>%
fill(everything(), .direction = 'updown') %>%
ungroup %>%
distinct(.keep_all = TRUE)
Or may also be
DF %>%
group_by(ID) %>%
mutate(across(everything(), ~ replace(., is.na(.),
rep(.[!is.na(.)], length.out = sum(is.na(.))))))
Or based on the comments
DF %>%
group_by(ID) %>%
mutate(across(where(~ any(is.na(.))), ~ {
i1 <- is.na(.)
ind <- which(i1)
i2 <- !i1
if(i1[1] == 1) rep(.[i2], each = n()/sum(i2)) else
rep(.[i2], length.out = n())
})) %>%
ungroup %>%
distinct(.keep_all = TRUE)
-output
# A tibble: 6 x 5
ID Col1 Col2 Col3 Col4
<int> <int> <int> <int> <int>
1 1 6 10 15 20
2 1 5 10 15 20
3 2 17 25 21 34
4 2 13 25 21 34
5 2 17 25 35 40
6 2 13 25 35 40
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With