I want to filter a data.table for dates - given start- and end-dates, I want to have all the rows which include a given date.
Now my problem: It works when I use a fixed date, but it returns an empty table when I want to store the date in an external variable. Any hints?
dt = data.table(begin=as.Date('2014-01-01'):as.Date('2014-01-10'),
end=as.Date('2014-01-01'):as.Date('2014-01-10')+c(1,1,1,2:8),
x=c('A','B','C','D','E','J','J','J','J','J'))
dt[,`:=`(begin_idate=as.IDate(begin, origin='1970-1-1'),
end_idate=as.IDate(end, origin= '1970-1-1'))]
dt[as.Date('2014-01-09')>begin_idate & as.Date('2014-01-09')<=end_idate ] # works
x=as.Date('2014-01-09')
dt[x>begin_idate & x<=end_idate ] #doesnt' work - empty data.table
This is due to the fact, that one of the columns of your data.table
is named x
. As data.table
evaluates its first argument (the one you are using) it looks for x
as column reference first. So in fact x>begin_idate
is interpreted as
dt$x > dt$begin_idate
Which throws an error.
To fix this use a name that is not already a column name of dt
. For example
xx = as.Date('2014-01-09')
dt[xx>begin_idate & xx<=end_idate ]
Result:
> dt[xx>begin_idate & xx<=end_idate ]
begin end x begin_idate end_idate
1: 16076 16080 J 2014-01-06 2014-01-10
2: 16077 16082 J 2014-01-07 2014-01-12
3: 16078 16084 J 2014-01-08 2014-01-14
In fact the results are equal:
all.equal(dt[as.Date('2014-01-09')>begin_idate & as.Date('2014-01-09')<=end_idate ],
dt[xx>begin_idate & xx<=end_idate ]) # TRUE
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With