Count the number of values in a window of a data.table per group

Question

I'm trying to add new columns to data.table, where values in rows depend on the relative relationship of the values in the row. To be more precise, if there is a value X in a row, I would like to know how many other values are in the same column (and group), that are within X-30.

That is, given this:

DT<-data.table(
X = c(1, 2, 2, 1, 1, 2,  1, 2, 2, 1, 1, 1),
Y = c(100, 101, 133, 134, 150, 156,  190, 200, 201, 230, 233, 234),
Z = c(1, 2, 3, 4, 5, 6,  7, 8, 9, 10, 11, 12))

I would like to get a new column, with values:

N <- c(0, 0, 0, 0, 1, 1,  0, 0, 1, 0, 1, 2)

I've tried the following, but I don't get the results I could use:

DT[,list(Y,num=cumsum(Y[-.I]>DT[.I,Y]-30),Z),by=.(X)]

Any ideas how to do this?

David Arenburg · Accepted Answer

This is probably can be achieved with a rolling join (?), but here is a foverlaps alternative for now

DT[, `:=`(indx = .I, Y2 = Y - 30L, N = 0L)] # Add row index and a -30 interval
setkey(DT, X, Y2, Y) # Sort by X and the intervals (for fovelaps)
res <- foverlaps(DT, DT)[Y2 > i.Y2, .N, keyby = indx] # Run foverlaps and check what can we catch
setorder(DT, indx) # go back to the original order
DT[res$indx, N := res$N][, c("indx", "Y2") := NULL] # update results and remove cols
DT
#     X   Y  Z N
#  1: 1 100  1 0
#  2: 2 101  2 0
#  3: 2 133  3 0
#  4: 1 134  4 0
#  5: 1 150  5 1
#  6: 2 156  6 1
#  7: 1 190  7 0
#  8: 2 200  8 0
#  9: 2 201  9 1
# 10: 1 230 10 0
# 11: 1 233 11 1
# 12: 1 234 12 2

Alternately, use the which=TRUE option of foverlaps to make the overlap merge smaller:

# as above
DT[, `:=`(indx = .I, Y2 = Y - 30L, N = 0L)]
setkey(DT, X, Y2, Y)

# using which=TRUE:
res <- foverlaps(DT, DT, which=TRUE)[xid > yid, .N, by=xid]
DT[res$xid, N := res$N]
setorder(DT, indx)
DT[, c("Y2","indx") := NULL]

Frank · Answer

Here's another way:

DT[order(Y), N := 0:(.N-1) - findInterval(Y - 30, Y), by = X]

all.equal(DT$N,N) # TRUE

Count the number of values in a window of a data.table per group

Tags:

r

data.table

Nikola Knezevic

2 Answers

David Arenburg

Frank

Recent Activity

Donate For Us

Count the number of values in a window of a data.table per group

Tags:

r

data.table

Nikola Knezevic

2 Answers

David Arenburg

Frank

Related questions

Recent Activity

Donate For Us