Identify records where a given sequence of events happens within x days

Question

I have a large data.table, similar in structure to df:

library("data.table")
df <- data.frame(part = c("A", "B", "A", "C", "A", "D", "B", "D", "E"), 
                 day = c(1, 2, 3, 4, 5, 6, 6, 7, 15), 
                 code = c("S", "S", "P", "X", "P", "S", "P", "P", "P"))
setDT(df)
df
   part day code
1:    A   1    S
2:    B   2    S
3:    A   3    P
4:    C   4    X
5:    A   5    P
6:    D   6    S
7:    B   6    P
8:    D   7    P
9:    E  15    P

How can I add a column that flags records where code = S and the same part has code = P within 3 subsequent days? Expected result:

   part day code  flag
1:    A   1    S  TRUE
2:    B   2    S FALSE
3:    A   3    P FALSE
4:    C   4    X FALSE
5:    A   5    P FALSE
6:    D   6    S  TRUE
7:    B   6    P FALSE
8:    D   7    P FALSE
9:    E  15    P FALSE

Frank · Accepted Answer

I think this does it

df[, v := FALSE ]
df[code == "S", v := !is.na(
  df[code == "P"][df[code == "S"], on=c("part", "day"), roll=-3, which=TRUE]
)]

   part day code     v
1:    A   1    S  TRUE
2:    B   2    S FALSE
3:    A   3    P FALSE
4:    C   4    X FALSE
5:    A   5    P FALSE
6:    D   6    S  TRUE
7:    B   6    P FALSE
8:    D   7    P FALSE
9:    E  15    P FALSE

How it works !is.na(x[i, which=TRUE]) tells us whether each row of i found a match in x. (This might break down if i finds multiple matches.) The roll part expands the scope of matches to cover rows that are nearby in terms of the last column being joined on.

I'm not sure that the roll value there is correct since I have never used it this way.

code_is_entropy · Answer

This can work too-

df$v <- as.logical((df$code== "S") *
        c(sapply(seq(1:(nrow(df)-2)), function(x)
           {
            max(df[(x:x+2),"code"] == "P")
           }),
          df[nrow(df)-1,"code"]=="P",
          df[nrow(df),"code"]=="P"))

Identify records where a given sequence of events happens within x days

Tags:

r

data.table

ddunn801

2 Answers

Frank

code_is_entropy

Recent Activity

Donate For Us

Identify records where a given sequence of events happens within x days

Tags:

r

data.table

ddunn801

2 Answers

Frank

code_is_entropy

Related questions

Recent Activity

Donate For Us