I encountered a weird error message in data.table
I modified a data.table using :=, and it ran without any error.
When I tried to put the code into a function, I got the following error message:
Error in `:=`(date, as.Date(as.character(date), "%Y%m%d") - 1) :
:= and `:=`(...) are defined for use in j, once only and in particular ways. See help(":="). Check is.data.table(DT) is TRUE.
Here's a reproducible example:
testdat <- data.table(ID = c(1:10), date = c(20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101), Number = rnorm(10))
# The single line command works fine.
testdat[, date := as.Date(as.character(date),"%Y%m%d") - 1][, Number:= NULL]
# But if I wrote them into a function, it failed.
# ( In this case, it worked as well.. So I got totally lost. )
test2 <- data.frame(ID = c(1:10), date = c(20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101), Number = rnorm(10))
readdata <- function(fn){
  DT <- data.table(fn)
  DT[, date := as.Date(as.character(date), "%Y%m%d") - 1][, Number := NULL]
  return(DT)
}
To describe the problem better, I've put part of my original code here, so you can see where it goes wrong.
readdata <- function(fn){
DT <- fread(fn, sep=",")
# DT <- fread("1202.txt")
setnames(DT, paste0("V",c(1:12)), column_names)
# Modification on date
setkey(DT,uid)
DT[,date := as.Date(as.character(date),"%Y%m%d") - 1][, ignore:= NULL] #ignore is the name of one column
...}
I have a list of txt files, and I want to run the same calculation on each of them. The first step is to read each file with fread, one by one. Suppose I want to do the calculation on the "1202.txt" file. If I start from DT <- fread("1202.txt") and then run the remaining steps interactively, the error does not appear.
If I call readdata("1202.txt") instead, the error message appears.
The strangest part is that I have used readdata before without any errors.
So what's going on here? Any suggestions? Thanks.
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.8.11
loaded via a namespace (and not attached):
[1] tools_3.0.2
EDIT
After some trials, I found that the code works if I modify it as follows:
readdata <- function(fn){
DT <- fread(fn, sep=",")
DT <- data.table(DT) ## Just add this line compared to the original one.
# DT <- fread("1202.txt")
setnames(DT, paste0("V",c(1:12)), column_names)
# Modification on date
setkey(DT,uid)
DT[,date := as.Date(as.character(date),"%Y%m%d") - 1][, ignore:= NULL] #ignore is the name of one column
...}
So is the error caused by fread? The result of fread should already be a data.table, so why do I need to convert it with data.table(DT)?
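The error message itself suggests checking is.data.table(DT), so one way to narrow this down is to inspect the object right after fread. The following is only a minimal diagnostic sketch: check_dt is a hypothetical helper (not part of data.table), and the temporary CSV is a stand-in for "1202.txt", which is not available here.

```r
library(data.table)

# Hypothetical helper (not a data.table function): report what DT really is
# and which data.table version is installed.
check_dt <- function(DT) {
  list(
    is_dt   = is.data.table(DT),
    classes = class(DT),
    version = as.character(packageVersion("data.table"))
  )
}

# Small stand-in file, since the original "1202.txt" is not available.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(ID = 1:3, date = rep(20130101L, 3)), tmp, row.names = FALSE)

DT <- fread(tmp)
info <- check_dt(DT)
print(info)

# If is_dt is TRUE, := can be used in j as usual, with no data.table(DT) copy.
DT[, date := as.Date(as.character(date), format = "%Y%m%d") - 1]
```

If is_dt comes back FALSE, or the reported version is not the one you expect, the installed build is a more likely culprit than the function itself.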
EDIT
Thanks for your attention. Here's an update from Feb 4th, 2014.
I first uninstalled 1.8.11 and followed Matt's instructions: I installed 1.8.10 from CRAN again and then followed his code step by step. It ran without any error.
Then I uninstalled 1.8.11 and tried installing 1.8.11 again using the precompiled zip file.
As usual, there was a warning message:
> install.packages("~/Desktop/data.table_1.8.11.zip", repos = NULL)
Warning in install.packages :
package '~/Desktop/data.table_1.8.11.zip' is not available (for R version 3.0.2)
Installing package into 'C:/Users/James/R/win-library/3.0' (as 'lib' is unspecified)
package 'data.table' successfully unpacked and MD5 sums checked
> require(data.table)
Loading required package: data.table
data.table 1.8.11 For help type: help("data.table")
The warning message seems to be wrong, because the package loaded without problems, and this time the whole process ran without any error. Thanks to Matt, Arun, and everyone else for your patience. I'm a beginner with data.table, and your kindness is really appreciated.
One more thing, which I already reported in this link, is still unsolved:
> ?melt.data.table
No documentation for 'melt.data.table' in specified packages and libraries:
you could try '??melt.data.table'
It's really a pity. Any ideas?
I reported my sessionInfo in that link. I am using Windows 8.1 64-bit.
After reinstalling data.table v1.8.10 / v1.8.11 (I tried both versions) and starting a new R session, the problem was solved.
It turns out my problem was caused by a 5 month old development version being installed.
The data.table homepage was slightly misleading:
Last recommended snapshot precompiled for Windows: v1.8.11 rev931 04 Sep 2013
The homepage has been improved and now reads:
install.packages("data.table", repos="http://R-Forge.R-project.org")
Or, if that fails, the last precompiled .zip for Windows copied to this homepage may suffice: v1.8.11 rev1110 04 Feb 2014
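Since the root cause was a stale development build, it may help to confirm which build R actually finds before and after reinstalling. This is a rough sketch using base and utils functions only; the exact content of the Built field depends on how the package was built and installed.

```r
# Version of data.table found on the current library path.
installed <- packageVersion("data.table")
print(installed)

# Directory the package would be loaded from; useful when several
# libraries are on .libPaths() and an old copy shadows a newer one.
print(find.package("data.table"))

# DESCRIPTION fields of the installed copy; "Built" usually includes
# the R version and the date the binary was built.
desc <- packageDescription("data.table")
print(desc$Built)
```

Run this in a fresh R session after reinstalling, so that a previously loaded copy of the package does not hide the change.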
Thanks to all of you for your valuable answers and comments.
(This is too long for a comment, so I'm posting it as an answer.) I can't reproduce your error. (Maybe a data.table expert can give you a better explanation.) This works fine for me:
readdata <- function(fn){
  DT <- fread(fn)  ## no need to pass sep here; fread detects it
  DT[, date := as.Date(as.character(date), "%Y%m%d") - 1][, Number := NULL]
  return(DT)
}
write.csv(test2, 'test2.csv', row.names = FALSE)  ## fread works better without row names
readdata('test2.csv')
ID date
1: 1 2012-12-31
2: 2 2012-12-31
3: 3 2012-12-31
4: 4 2012-12-31
5: 5 2012-12-31
6: 6 2012-12-31
7: 7 2012-12-31
8: 8 2012-12-31
9: 9 2012-12-31
10: 10 2012-12-31
[Edit from Matt] I can't reproduce either. As per comment, here is precisely what I did. How does yours differ?
$ R
R version 3.0.2 (2013-09-25) -- "Frisbee Sailing"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
> require(data.table)
Loading required package: data.table
data.table 1.8.10 For help type: help("data.table")
> test2 <- data.frame(ID = c(1:10), date = c(20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101, 20130101), Number = rnorm(10))
> test2
ID date Number
1 1 20130101 0.26937712
2 2 20130101 0.72113244
3 3 20130101 -0.66086356
4 4 20130101 0.47507096
5 5 20130101 0.69400777
6 6 20130101 -1.26948436
7 7 20130101 1.75919781
8 8 20130101 -0.05306206
9 9 20130101 1.59880358
10 10 20130101 0.69531516
> write.csv(test2,'test2.csv',row.names=FALSE)
> readdata <- function(fn){
+ DT <- fread(fn)
+ DT[, date:= as.Date(as.character(date),"%Y%m%d") - 1][, Number:= NULL]
+ return(DT)
+ }
> readdata("test2.csv")
ID date
1: 1 2012-12-31
2: 2 2012-12-31
3: 3 2012-12-31
4: 4 2012-12-31
5: 5 2012-12-31
6: 6 2012-12-31
7: 7 2012-12-31
8: 8 2012-12-31
9: 9 2012-12-31
10: 10 2012-12-31
>