I would like to be able to keep NAs for only groups that have more than two entries and just want to leave alone any groups that have 1 entry (regardless if they have NAs or not). That is, if the group has two elements, keep only the NA. If it has one then just take whatever is there. Here is a reprex of the type of data I have:
library(dplyr)
data <- data.frame(
x = c(1, NA_real_, 3, NA_real_),
y = c("grp1", "grp2", "grp2", "grp3")
)
data
#> x y
#> 1 1 grp1
#> 2 NA grp2
#> 3 3 grp2
#> 4 NA grp3
Then here is the fairly ugly way I have achieved what I want:
raw <- data %>%
group_by(y) %>%
mutate(n = n())
results <- bind_rows(
raw %>%
filter(n == 2) %>%
filter(is.na(x)),
raw %>%
filter(n == 1)
) %>%
ungroup() %>%
select(-n)
results
#> # A tibble: 3 × 2
#> x y
#> <dbl> <chr>
#> 1 NA grp2
#> 2 1 grp1
#> 3 NA grp3
Update after clarification: We have only to add ! and change == to >:
library(dplyr)
data %>%
group_by(y) %>%
filter(!(!is.na(x) & n() > 1))
x y
<dbl> <chr>
1 1 grp1
2 NA grp2
3 NA grp3
We could define filter with the max(row_number():
update: instead of max(row_number() we could use n() (many thanks to @Juan C)
library(dplyr)
data %>%
group_by(y) %>%
filter(!(is.na(x) & n() == 1))
# filter(!(is.na(x) & max(row_number()) == 1))
x y
<dbl> <chr>
1 1 grp1
2 NA grp2
3 3 grp2
Here's another approach that works
data %>%
group_by(y) %>%
filter(!(!is.na(x) & n() > 1))
# A tibble: 3 × 2
# Groups: y [3]
x y
<dbl> <chr>
1 1 grp1
2 NA grp2
3 NA grp3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With