Is the 'sorted' attribute part of the official data.table API?
I frequently derive a week/month/quarter/year variable from a date variable, which is of course a monotonic transformation. I then run grouped operations with by on one of these monotonically derived variables.
Is it safe to replace the name of my date variable in the sorted attribute with the name of the derived week/month/etc. variable and still have things work properly? I.e., is the following safe to do:
library(data.table)
library(lubridate)
DT <- data.table(day=as.Date(c('2006-01-30', '2006-01-31', '2006-02-01', '2006-02-02')),
d=1:4, key='day')
DT[, month := floor_date(day, unit='month')]
# is this safe?
attr(DT, 'sorted') <- 'month'
I couldn't figure out if there were some other underlying data structures that reference into the table that might cause problems with this technique.
Yes, I use that trick all the time when I'm sure that the data is sorted, but use setattr instead to avoid a copy:
setattr(DT, 'sorted', 'month')
If you look at the code of setkeyv you'll see that's exactly what it does - sorts the data and then sets the "sorted" attribute.
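Since the trick only holds when the data really is sorted by the new column, a cheap guard before setting the attribute can help. The following is a minimal sketch, not anything data.table itself requires: the is.unsorted check is my own addition, and the month column is built with base R's format/as.Date as a stand-in for floor_date so the example runs without lubridate.

```r
library(data.table)

DT <- data.table(day = as.Date(c('2006-01-30', '2006-01-31', '2006-02-01', '2006-02-02')),
                 d = 1:4, key = 'day')

# Base R stand-in for floor_date(day, unit = 'month'): first day of each month
DT[, month := as.Date(format(day, '%Y-%m-01'))]

# Guard (an assumption of this sketch, not part of data.table):
# refuse to relabel the key unless the derived column is non-decreasing
stopifnot(!is.unsorted(DT$month))

# Set the attribute by reference, avoiding the copy that attr<- would make
setattr(DT, 'sorted', 'month')

key(DT)                 # the key is now reported as the derived column
DT[, .N, by = month]    # grouping on the derived column
```

If the guard fails, the data is not sorted by the derived column and setkey(DT, month) is the safe route, since it actually reorders the rows.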