I am having some trouble doing a simple operation.
What I want is to simply select the rows, grouped by id
, that have all of the values (-1, -2, -3, -4)
, and apply a tag based on this condition.
id time
1 1 -4
2 1 -3
3 1 -2
4 1 -1
5 2 -1
6 2 -2
7 3 -1
8 3 -3
9 4 -1
10 4 -2
11 4 -3
12 4 -4
Intuitively I was going for
rb[rb$time %in% c(-1, -2, -3, -4), ]
However, when I do that, all id
are selected. So, the %in%
operator uses the OR
|
operator.
So, I thought I could do this then, to use the AND &
operator
rb[rb$time == -1 &
rb$time == -2 &
rb$time == -3 &
rb$time == -4, ]
But it does not work.
Any clue how I could tag or subset these id
, such that
id time tag
1 1 -4 1
2 1 -3 1
3 1 -2 1
4 1 -1 1
5 2 -1 0
6 2 -2 0
7 3 -1 0
8 3 -3 0
9 4 -1 1
10 4 -2 1
11 4 -3 1
12 4 -4 1
One dplyr
possibility could be:
df %>%
group_by(id) %>%
mutate(tag = all(c(-1, -2, -3, -4) %in% time) * 1)
id time tag
<int> <int> <dbl>
1 1 -4 1
2 1 -3 1
3 1 -2 1
4 1 -1 1
5 2 -1 0
6 2 -2 0
7 3 -1 0
8 3 -3 0
9 4 -1 1
10 4 -2 1
11 4 -3 1
12 4 -4 1
And the same with base R
could be:
with(df, ave(time, id, FUN = function(x) all(c(-1, -2, -3, -4) %in% x) * 1))
One of many base R options is to use aggregate
(we could also rename and convert to integer to get 0 or 1, but booleans seem better fit for the purpose):
tags <- aggregate(df$time, by = df["id"], FUN = function(el) all(-1:-4 %in% el))
df <- merge(df, tags)
Result:
df
# id time x
# 1 1 -4 TRUE
# 2 1 -3 TRUE
# 3 1 -2 TRUE
# 4 1 -1 TRUE
# 5 2 -1 FALSE
# 6 2 -2 FALSE
# 7 2 -1 FALSE
# 8 2 -3 FALSE
# 9 3 -1 TRUE
# 10 3 -2 TRUE
# 11 3 -3 TRUE
# 12 3 -4 TRUE
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With