I would like to modify a data.table
within a function. If I use the :=
feature within the function, the result is only printed for the second call.
Look at the following illustration:
library(data.table)
mydt <- data.table(x = 1:3, y = 5:7)
myfunction <- function(dt) {
dt[, z := y - x]
dt
}
When I call only the function, the table is not printed (which is the standard behaviour. However, if I save the returned data.table
into a new object, it is not printed at the first call, only for the second one.
myfunction(mydt) # nothing is printed
result <- myfunction(mydt)
result # nothing is printed
result # for the second time, the result is printed
mydt
# x y z
# 1: 1 5 4
# 2: 2 6 4
# 3: 3 7 4
Could you explain why this happens and how to prevent it?
As David Arenburg mentions in a comment, the answer can be found here. There was a bug fixed in the version 1.9.6 but the fix introduced this downside.
One should call DT[]
at the end of the function to prevent this behaviour.
myfunction <- function(dt) {
dt[, z := y - x][]
}
myfunction(mydt) # prints immediately
# x y z
# 1: 1 5 4
# 2: 2 6 4
# 3: 3 7 4
This is described in data.table
FAQ 2.23:
Why do I have to type
DT
sometimes twice after using:=
to print the result to console?
This is an unfortunate downside to get #869 to work. If a
:=
is used inside a function with noDT[]
before the end of the function, then the next timeDT
is typed at the prompt, nothing will be printed. A repeatedDT
will print. To avoid this: include aDT[]
after the last:=
in your function. If that is not possible (e.g., it's not a function you can change) thenprint(DT)
andDT[]
at the prompt are guaranteed to print. As before, adding an extra[]
on the end of:=
query is a recommended idiom to update and then print; e.g.>DT[,foo:=3L][]
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With