Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find last row in a sequence of consecutive values

I am working with a series of TRUE and FALSE sequences that are organized by group and stretching over some span of time (e.g. days). Below, I have taken the time to recreate an example and convert the data.frame into a data.table using the data.table package.

> dput(df)
structure(list(day = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
4L, 5L, 6L, 7L), group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("a", "b"), class = "factor"), 
    condition = structure(c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 
    2L, 1L, 1L, 2L, 2L), .Label = c("FALSE", "TRUE"), class = "factor")), .Names = c("day", 
"group", "condition"), row.names = c(NA, -14L), class = "data.frame")
> library(data.table)
> df <- as.data.table(df)
> setkey(df, group)
> df
    day group condition
 1:   1     a     FALSE
 2:   2     a     FALSE
 3:   3     a      TRUE
 4:   4     a      TRUE
 5:   5     a      TRUE
 6:   6     a     FALSE
 7:   7     a     FALSE
 8:   1     b     FALSE
 9:   2     b      TRUE
10:   3     b      TRUE
11:   4     b     FALSE
12:   5     b     FALSE
13:   6     b      TRUE
14:   7     b      TRUE

Using data.table, I know how to select the first day for each group where condition is equal to "TRUE":

> df[condition == "TRUE", day[1], by = group]
   group V1
1:     a  3
2:     b  2
> df
    day group condition
 1:   1     a     FALSE
 2:   2     a     FALSE
 3:   3     a      TRUE     <~~~~~~~~
 4:   4     a      TRUE
 5:   5     a      TRUE
 6:   6     a     FALSE
 7:   7     a     FALSE
 8:   1     b     FALSE
 9:   2     b      TRUE     <~~~~~~~~
10:   3     b      TRUE
11:   4     b     FALSE
12:   5     b     FALSE
13:   6     b      TRUE
14:   7     b      TRUE

What I would like to accomplish is a final output that identifies the last row for the first sequence of each group where the condition is equal to "TRUE". Please see my output below:

       group V1
    1:     a  5
    2:     b  3
> df
    day group condition
 1:   1     a     FALSE
 2:   2     a     FALSE
 3:   3     a      TRUE
 4:   4     a      TRUE
 5:   5     a      TRUE     <~~~~~
 6:   6     a     FALSE
 7:   7     a     FALSE
 8:   1     b     FALSE
 9:   2     b      TRUE
10:   3     b      TRUE     <~~~~~
11:   4     b     FALSE
12:   5     b     FALSE
13:   6     b      TRUE
14:   7     b      TRUE

Any advice, suggestions, or other material would be much appreciated. Thank you for your time and feel free to ask any clarifying questions!

like image 711
ccapizzano Avatar asked Oct 01 '22 08:10

ccapizzano


1 Answers

Here we can look for changes in the condition, specifically a transition from TRUE to FALSE.

a <- df[diff(c(condition,1)) == -1, day[1], by=group]

#    group V1
# 1:     a  5
# 2:     b  3

Thus does assume that the condition column is a factor with FALSE as the first level.

like image 90
MrFlick Avatar answered Oct 03 '22 15:10

MrFlick