I am using setDT() to add additional columns to a data.table, but the columns from this call
setDT(mydata)[, paste0('F2_E',2:30) := lapply(.SD, function(x) log(value/x)), .SDcols = 32:60][]
are not added when I run the following script:
library(data.table)
library(zoo)
date = seq(as.Date("2016-01-01"),as.Date("2016-05-10"),"day")
value =seq(1,131,1)
mydata = data.frame (date, value)
mydata
setDT(mydata)[, paste0('F1',2:30) := lapply(2:30, function(x) rollmeanr(value, x, fill = rep(NA,x-1)) ),][]
setDT(mydata)[, paste0('F2',2:30) := lapply(2:30, function(x) rollapply(value,x,FUN="median",align="right",fill=NA))][]
setDT(mydata)[, paste0('F1_E',2:30) := lapply(.SD, function(x) log(value/x) ), .SDcols = 3:31][]
setDT(mydata)[, paste0('F2_E',2:30) := lapply(.SD, function(x) log(value/x)), .SDcols = 32:60][]
rbind(colnames(mydata))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27]
[1,] "date" "value" "F12" "F13" "F14" "F15" "F16" "F17" "F18" "F19" "F110" "F111" "F112" "F113" "F114" "F115" "F116" "F117" "F118" "F119" "F120" "F121" "F122" "F123" "F124" "F125" "F126"
[,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35] [,36] [,37] [,38] [,39] [,40] [,41] [,42] [,43] [,44] [,45] [,46] [,47] [,48] [,49] [,50] [,51] [,52] [,53] [,54]
[1,] "F127" "F128" "F129" "F130" "F22" "F23" "F24" "F25" "F26" "F27" "F28" "F29" "F210" "F211" "F212" "F213" "F214" "F215" "F216" "F217" "F218" "F219" "F220" "F221" "F222" "F223" "F224"
[,55] [,56] [,57] [,58] [,59] [,60] [,61] [,62] [,63] [,64] [,65] [,66] [,67] [,68] [,69] [,70] [,71] [,72] [,73] [,74] [,75] [,76] [,77]
[1,] "F225" "F226" "F227" "F228" "F229" "F230" "F1_E2" "F1_E3" "F1_E4" "F1_E5" "F1_E6" "F1_E7" "F1_E8" "F1_E9" "F1_E10" "F1_E11" "F1_E12" "F1_E13" "F1_E14" "F1_E15" "F1_E16" "F1_E17" "F1_E18"
[,78] [,79] [,80] [,81] [,82] [,83] [,84] [,85] [,86] [,87] [,88] [,89]
[1,] "F1_E19" "F1_E20" "F1_E21" "F1_E22" "F1_E23" "F1_E24" "F1_E25" "F1_E26" "F1_E27" "F1_E28" "F1_E29" "F1_E30"
You can see there are no F2_E2, F2_E3, etc. columns.
Why would those columns not be added?
Short answer:
Use setDT(mydata) once, and separately. Then do all your assignment statements.
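As a minimal sketch of that advice, here is the script from the question with setDT() called once and every later assignment made on the name mydata. It assumes the same data and column names as the question; the only other change is that the rolling-mean call uses fill = NA, which I believe gives the same right-aligned padding.

```r
library(data.table)
library(zoo)

date   <- seq(as.Date("2016-01-01"), as.Date("2016-05-10"), "day")
value  <- seq(1, 131, 1)
mydata <- data.frame(date, value)

setDT(mydata)  # convert to data.table by reference, once

# Every := below is called on the name `mydata`, not on setDT(mydata)
mydata[, paste0("F1", 2:30) := lapply(2:30, function(x) rollmeanr(value, x, fill = NA))]
mydata[, paste0("F2", 2:30) := lapply(2:30, function(x) rollapply(value, x, FUN = median, align = "right", fill = NA))]
mydata[, paste0("F1_E", 2:30) := lapply(.SD, function(x) log(value / x)), .SDcols = 3:31]
mydata[, paste0("F2_E", 2:30) := lapply(.SD, function(x) log(value / x)), .SDcols = 32:60]

# The F2_E columns should now be present
grep("^F2_E", names(mydata), value = TRUE)
```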
Additionally, if you're going to add a lot of columns, use alloc.col() to over-allocate more slots up front, at least until the next release (v1.9.8), i.e.,
setDT(mydata)
truelength(mydata) # [1] 100
alloc.col(mydata, 1000L)
truelength(mydata) # [1] 1000
In the current development version, v1.9.7, we've increased the default over-allocation to 1024, so this should happen extremely rarely.
A quick and slightly detailed explanation:
This happens because data.table over-allocates column pointers during its creation, and the default over-allocation length is 100 columns. You can check this with truelength(). See ?truelength.
require(data.table)
mydata = data.frame (x=1, y=2)
setDT(mydata) ## convert to data.table by reference
length(mydata) ## equals the columns assigned
# [1] 2
truelength(mydata) ## total number of column slots allocated
# [1] 100
Let's add 30 more columns the way you did.
setDT(mydata)[, paste0("z", 1:30) := 1L]
length(mydata) ## [1] 32
truelength(mydata) ## [1] 100
And another 30.
setDT(mydata)[, paste0("z", 31:60) := 1L]
length(mydata) ## [1] 62
truelength(mydata) ## [1] 100
And another 30.
setDT(mydata)[, paste0("z", 61:90) := 1L]
length(mydata) ## [1] 92
truelength(mydata) ## [1] 100
Now, the next time we do this, we have to add 30 more columns, but only 8 slots are free. So we need to create another object with even more over-allocated slots, assign all the columns currently in mydata to the new object, and finally assign it back to mydata. This is handled internally and automatically, so the user doesn't have to keep track. So the next time we do:
setDT(mydata)[, paste0("z", 91:120) := 1L]
The function [.data.table realises it needs to over-allocate again and does so, and the new columns get added to the new object. The problem is assigning this new object back to mydata, which lives in the parent frame of [.data.table. That is done with an assign() statement, which accepts only a variable name as character input, and setDT(mydata) is a call, not a name. So the re-assignment step fails, and the over-allocated result is never reflected back in the original object. If you'd done mydata[, paste0(..) := ...] instead, the input object mydata is a name and can be used to assign the over-allocated result back to the original object, which is why the suggestion from @thelatemail works.
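To make the slot arithmetic above concrete, here is a rough sketch of how you could check the spare capacity yourself before a bulk assignment. free_slots() is a hypothetical helper written for this illustration, not a data.table function; it only combines length() and truelength(). The actual numbers depend on your data.table version and its default over-allocation, so none are shown.

```r
library(data.table)

# Hypothetical helper (not part of data.table):
# number of column pointer slots still unused in dt
free_slots <- function(dt) truelength(dt) - length(dt)

dt <- data.frame(x = 1, y = 2)
setDT(dt)                  # convert once, by reference

before <- free_slots(dt)   # spare slots right after conversion
n_before <- length(dt)

dt[, z := 1L]              # adding one column uses up one spare slot
after <- free_slots(dt)

c(before = before, after = after)
```

If the number of columns you are about to add exceeds free_slots(dt), that is the situation where the object has to be re-allocated, so that is when calling := on the name (or over-allocating up front with alloc.col()) matters.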
If this is all too advanced, just upgrade to the devel version, where this is very unlikely to happen (unless you want more than 1024 columns in your data.table).
I've filed #1731 to remind us to come back to this and see if there are other ways to get around this case.