I have a data frame which looks like this:
Status ID
A 1
B 1
B 1
A 1
B 1
A 1
A 2
A 2
A 2
A 2
B 3
B 3
B 3
My desired output looks like this:
Status ID
B 1
B 1
B 1
A 2
A 2
A 2
A 2
B 3
B 3
B 3
As you can see, the only group that changes is ID = 1. If a group contains both an "A" and a "B" status, I'd like to remove the "A" rows.
Groups with ID 2 and 3 did not change (i.e. no rows were removed): if a group contains only "A" statuses, it stays the same, and likewise if it contains only "B" statuses. Hence both remain unchanged.
Using dplyr, this is my attempt:
library(dplyr)
df1_clean <- df1 %>% group_by(ID, Status)
%>% filter(ifelse((Status == A | Status == B), Status == B,
ifelse((Status == A), Status == A,
ifelse((Status == B), Status == B))))
However, this filter will not work. Any help would be appreciated!
We can use filter after grouping by ID:
library(dplyr)
df %>%
  group_by(ID) %>%
  filter(all(Status == "A") | all(Status == "B") | Status == "B")
# Status ID
# <fct> <int>
# 1 B 1
# 2 B 1
# 3 B 1
# 4 A 2
# 5 A 2
# 6 A 2
# 7 A 2
# 8 B 3
# 9 B 3
#10 B 3
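To try this on the question's data, here is a minimal self-contained sketch. The data frame construction is my assumption, rebuilt from the table in the question; stringsAsFactors = TRUE is also an assumption, chosen so that Status is a factor as in the <fct> column above.

```r
library(dplyr)

# Rebuild the example data from the question (assumed construction)
df <- data.frame(
  Status = c("A", "B", "B", "A", "B", "A", "A", "A", "A", "A", "B", "B", "B"),
  ID     = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L),
  stringsAsFactors = TRUE
)

# Keep a row if its group is all "A", all "B", or the row itself is "B"
res <- df %>%
  group_by(ID) %>%
  filter(all(Status == "A") | all(Status == "B") | Status == "B") %>%
  ungroup()

# Summarise which statuses survive in each group
count(res, ID, Status)
```

The all() calls return a single value per group, which is recycled against the row-wise Status == "B" test, so mixed groups keep only their "B" rows while single-status groups are kept whole.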
We can also use n_distinct
df %>%
  group_by(ID) %>%
  filter(n_distinct(Status) == 1 | Status == "B")
Equivalent base R versions using ave would be:
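# as.character() matters when Status is a factor: ave() writes the result
# back into a vector of the input's type, and logicals are not valid factor levels
df[as.logical(with(df, ave(as.character(Status), ID, FUN = function(x)
  all(x == "A") | all(x == "B") | x == "B"))), ]
df[as.logical(with(df, ave(as.character(Status), ID, FUN = function(x)
  length(unique(x)) == 1 | x == "B"))), ]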
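Another base R option, sketched here as an alternative rather than as part of the approach above, is to pass ave a logical vector so that no as.logical conversion is needed. It states the rule from the question directly: drop a row only when it is "A" and its group also contains a "B". The df construction and the has_b name are my own, for illustration. If statuses other than "A" and "B" can occur, this variant may keep rows that the versions above would drop.

```r
# Example data rebuilt from the question (assumed construction)
df <- data.frame(
  Status = c("A", "B", "B", "A", "B", "A", "A", "A", "A", "A", "B", "B", "B"),
  ID     = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L)
)

# For every row: does its ID group contain at least one "B"?
# The input to ave() is logical, so the result stays logical.
has_b <- ave(df$Status == "B", df$ID, FUN = any)

# Drop "A" rows only in groups that also have a "B"
out <- df[!(df$Status == "A" & has_b), ]
out
```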