
Add a countdown column to data.table containing rows until a special row encountered

Tags: r, data.table

I have a data.table of ordered, labelled data, and I want to add a column that tells me how many records remain until the next "special" record, which resets the countdown.

For example:

DT = data.table(idx = c(1,3,3,4,6,7,7,8,9), 
                name = c("a", "a", "a", "b", "a", "a", "b", "a", "b"))
setkey(DT, idx)
#manually add the answer
DT[, countdown := c(3,2,1,0,2,1,0,1,0)]

Gives

> DT
   idx name countdown
1:   1    a         3
2:   3    a         2
3:   3    a         1
4:   4    b         0
5:   6    a         2
6:   7    a         1
7:   7    b         0
8:   8    a         1
9:   9    b         0

The countdown column tells me how many rows remain until the next row named "b". The question is how to create that column in code.

Note that the key is not evenly spaced and may contain duplicates (so it is not very useful in solving the problem). In general the non-"b" names could differ from one another, but I could add a dummy column that is just TRUE/FALSE if the solution requires this.

Corvus asked Mar 05 '13 18:03


2 Answers

Here's another idea:

## Create groups that end at each occurrence of "b"
DT[, cd:=0L]
DT[name=="b", cd:=1L]
DT[, cd:=rev(cumsum(rev(cd)))]
## Count down within them
DT[, cd:=max(.I) - .I, by=cd]
#    idx name cd
# 1:   1    a  3
# 2:   3    a  2
# 3:   3    a  1
# 4:   4    b  0
# 5:   6    a  2
# 6:   7    a  1
# 7:   7    b  0
# 8:   8    a  1
# 9:   9    b  0
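
To see why the reversed cumulative sum produces the right groups, here is a minimal base R sketch of the same two steps on plain vectors. The names `is_special`, `grp` and `countdown` are illustrative and not part of the code above, and `ave` stands in for the `by=` grouping.

```r
name <- c("a", "a", "a", "b", "a", "a", "b", "a", "b")

# TRUE on the rows that reset the countdown
is_special <- name == "b"

# Counting the specials from the bottom up gives every row the id of the
# group that ends at the next "b" at or below it
grp <- rev(cumsum(rev(is_special)))
grp
# [1] 3 3 3 3 2 2 2 1 1

# Within each group, distance from the row to the last row of the group
countdown <- ave(seq_along(name), grp, FUN = function(i) max(i) - i)
countdown
# [1] 3 2 1 0 2 1 0 1 0
```

Rows that come after the last "b" would fall into group 0 and count down to the end of the table, so they may need to be overwritten (for example with NA) if a trailing non-"b" stretch is possible.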
Josh O'Brien answered Nov 19 '22 21:11


I'm sure (or at least hopeful) that someone will post a pure "data.table" solution, but in the meantime you could make use of rle. Because the count runs down towards each "b" rather than up from it, we'll use rev to reverse the "name" values before proceeding.

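## Run lengths of the reversed names (the last row, a "b", now comes first)
runs <- rle(rev(DT$name))
## Count 1, 2, ... within each run
output <- sequence(runs$lengths)
## In this example the odd-numbered runs are the "b" rows; set them to zero
makezero <- cumsum(runs$lengths)[c(TRUE, FALSE)]
output[makezero] <- 0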

DT[, countdown := rev(output)]
DT
#    idx name countdown
# 1:   1    a         3
# 2:   3    a         2
# 3:   3    a         1
# 4:   4    b         0
# 5:   6    a         2
# 6:   7    a         1
# 7:   7    b         0
# 8:   8    a         1
# 9:   9    b         0
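
The question mentions that the non-"b" names may differ from one another, in which case runs of names no longer alternate between "b" and a single other value. One possible variation, sketched here in base R, is to run rle on a TRUE/FALSE flag instead of on the names themselves. The example names and the variable `special` are made up for illustration, and the sketch assumes, as in the question's data, that the last row is a "b".

```r
# Hypothetical data: several different non-"b" names
name <- c("x", "y", "y", "b", "z", "x", "b", "y", "b")

# Reverse the TRUE/FALSE flag so that we count away from each "b"
special <- rev(name == "b")

# Count 1, 2, ... within each run of the flag
output <- sequence(rle(special)$lengths)

# The special rows themselves get zero
output[special] <- 0

countdown <- rev(output)
countdown
# [1] 3 2 1 0 2 1 0 1 0
```

Indexing by the flag rather than by `c(TRUE, FALSE)` also avoids relying on the runs alternating strictly.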
A5C1D2H2I1M1N2O1R2T1 answered Nov 19 '22 20:11