I'm looking for behavior similar to inserting into an already keyed SQL table, where the new rows added are inserted into existing keys. For example, in this case:
dt <- data.table(a=1:10)
setkey(dt, a)
tables()
# NAME NROW MB COLS KEY
# [1,] dt 10 1 a a
dt.2 <- rbindlist(list(dt, data.table(a=1:5)))
tables()
# NAME NROW MB COLS KEY
# [1,] dt 10 1 a a
# [2,] dt.2 15 1 a
I would like to have the option of having dt.2 "inherit" the key (updated with the incremental data, obviously) from dt, instead of having no key as actually happened.
I was at first a bit surprised at the loss of the key in the first place, but that is clearly the documented behavior.
Is there a clean way of doing this without calling setkey after each rbind/rbindlist?
Essentially, data.table doesn't currently support row insert at all, let alone into a keyed table. rbind creates a new data.table, so it isn't fast or memory efficient.
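Until row insert exists, one workaround is to wrap the bind and the re-key in a single call. The sketch below is only an illustration: rbind_keep_key is a hypothetical helper name, not part of data.table, and it still builds a new table and re-sorts it, so it saves typing rather than time or memory.

```r
library(data.table)

# Hypothetical helper (not a data.table function): bind the tables,
# then re-apply the key of the first table to the result.
rbind_keep_key <- function(...) {
  tables <- list(...)
  key_cols <- key(tables[[1]])               # NULL if the first table has no key
  out <- rbindlist(tables, use.names = TRUE) # allocates a new, unkeyed table
  if (!is.null(key_cols)) {
    setkeyv(out, key_cols)                   # sorts `out` by reference and marks the key
  }
  out
}

dt <- data.table(a = 1:10)
setkey(dt, a)

dt.2 <- rbind_keep_key(dt, data.table(a = 1:5))
key(dt.2)   # the key columns carried over from dt
```

The key is taken only from the first argument, on the assumption that it is the "existing" table and the rest are increments.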
A similar question is here: How to delete a row by reference in data.table?
Currently, the typical workflow is to load files from disk using fread and rbindlist them together, or load data from a database using RODBC or similar.
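A minimal sketch of that file-based workflow follows. The temporary directory, the file names, and the column a are made up for illustration. The point is that the key is set once, after everything has been bound, rather than after each increment.

```r
library(data.table)

# Write two small example files to a temporary directory (illustrative data only).
tmp_dir <- tempfile("chunks_")
dir.create(tmp_dir)
fwrite(data.table(a = 6:10), file.path(tmp_dir, "part1.csv"))
fwrite(data.table(a = 1:5),  file.path(tmp_dir, "part2.csv"))

# Read every file with fread and bind the pieces in one rbindlist call.
files <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
combined <- rbindlist(lapply(files, fread))

# Key the combined table once, at the end.
setkey(combined, a)
key(combined)
```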
We'd like to add fast row insert, but it isn't done yet.