I am working with a series of TRUE and FALSE sequences that are organized by group and stretching over some span of time (e.g. days). Below, I have taken the time to recreate an example and convert the data.frame
into a data.table
using the data.table
package.
> dput(df)
structure(list(day = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L,
4L, 5L, 6L, 7L), group = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("a", "b"), class = "factor"),
condition = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L,
2L, 1L, 1L, 2L, 2L), .Label = c("FALSE", "TRUE"), class = "factor")), .Names = c("day",
"group", "condition"), row.names = c(NA, -14L), class = "data.frame")
> library(data.table)
> df <- as.data.table(df)
> setkey(df, group)
> df
day group condition
1: 1 a FALSE
2: 2 a FALSE
3: 3 a TRUE
4: 4 a TRUE
5: 5 a TRUE
6: 6 a FALSE
7: 7 a FALSE
8: 1 b FALSE
9: 2 b TRUE
10: 3 b TRUE
11: 4 b FALSE
12: 5 b FALSE
13: 6 b TRUE
14: 7 b TRUE
Using data.table
, I know how to select the first day for each group where condition is equal to "TRUE":
> df[condition == "TRUE", day[1], by = group]
group V1
1: a 3
2: b 2
> df
day group condition
1: 1 a FALSE
2: 2 a FALSE
3: 3 a TRUE <~~~~~~~~
4: 4 a TRUE
5: 5 a TRUE
6: 6 a FALSE
7: 7 a FALSE
8: 1 b FALSE
9: 2 b TRUE <~~~~~~~~
10: 3 b TRUE
11: 4 b FALSE
12: 5 b FALSE
13: 6 b TRUE
14: 7 b TRUE
What I would like to accomplish is a final output that identifies the last row for the first sequence of each group where the condition is equal to "TRUE". Please see my output below:
group V1
1: a 5
2: b 3
> df
day group condition
1: 1 a FALSE
2: 2 a FALSE
3: 3 a TRUE
4: 4 a TRUE
5: 5 a TRUE <~~~~~
6: 6 a FALSE
7: 7 a FALSE
8: 1 b FALSE
9: 2 b TRUE
10: 3 b TRUE <~~~~~
11: 4 b FALSE
12: 5 b FALSE
13: 6 b TRUE
14: 7 b TRUE
Any advice, suggestions, or other material would be much appreciated. Thank you for your time and feel free to ask any clarifying questions!
Here we can look for changes in the condition, specifically a transition from TRUE to FALSE.
a <- df[diff(c(condition,1)) == -1, day[1], by=group]
# group V1
# 1: a 5
# 2: b 3
Thus does assume that the condition column is a factor with FALSE as the first level.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With