
datatable.integer64 argument is not working for me, should it?

I am trying to load integer64 columns as character with fread. ?fread says that the integer64 argument is not implemented but that options(datatable.integer64) is. Even so, fread keeps loading the column as integer64.

How can I tell fread to load it as character? EDIT: [I originally thought that if colClasses was the answer, it would not let me specify a single column by name or index, which would be impractical because the table I load has tens of columns. This was WRONG.]

Here is a sample

# for integer64 support
library(bit64)
# for fast everything
library(data.table)

# here is a sample (dput output of a data.table with one integer64 column)
df <- structure(list(IDFD = structure(c(5.13878419797985e-299, 5.13878419797985e-299,
5.13878419797985e-299, 5.13878419797987e-299, 5.13878419797987e-299,
5.13878419797987e-299, 5.13878419797987e-299, 5.13878419797987e-299,
5.13878419797988e-299, 5.13878419797988e-299), class = "integer64")), .Names = "IDFD", row.names = c(NA,
-10L), class = c("data.table", "data.frame"))
# write the sample to file
write.csv(df, "test.csv", quote = FALSE, row.names = FALSE)

# I can't load it as character, with either the global option or the argument
options(datatable.integer64 = 'character')
str(fread("test.csv", integer64 = 'character'))
Classes ‘data.table’ and 'data.frame':  10 obs. of  1 variable:
 $ IDFD:Class 'integer64'  num [1:10] 5.14e-299 5.14e-299 5.14e-299 5.14e-299 5.14e-299 ...
statquant asked Dec 25 '22 21:12


1 Answer

This is implemented in v1.8.11, which is on R-Forge but not yet on CRAN. From NEWS:

o fread's integer64 argument implemented. Allows reading of integer64 data as 'double' or 'character' instead of bit64::integer64 (which remains the default as before). Thanks to Chris Neff for the suggestion. The default can be changed globally; e.g, options(datatable.integer64="character")
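
As a rough sketch of what that NEWS item describes, assuming a data.table version in which the integer64 argument is implemented (v1.8.11 or later); the temporary file and its contents are made up for illustration:

```r
library(data.table)

# Hypothetical sample file with one column of integers too large for 32 bits.
tmp <- tempfile(fileext = ".csv")
writeLines(c("IDFD,value",
             "1234567890123456789,1",
             "1234567890123456790,2"), tmp)

# Per call: read columns detected as integer64 as character or as double.
as_chr <- fread(tmp, integer64 = "character")
as_dbl <- fread(tmp, integer64 = "double")

# Globally: change the default used when the argument is not supplied,
# then restore the previous setting.
old <- options(datatable.integer64 = "character")
via_option <- fread(tmp)
options(old)

str(as_chr)
str(as_dbl)
str(via_option)
```

Note that "double" cannot represent every integer above 2^53 exactly, so "character" is the safer choice when the values are identifiers.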

Regarding:

If colClasses is the answer, I think it does not allow to specify a single column name or index and the table I load has tens of columns so unpracticable...

colClasses in fread does let you override the type for one or a few columns (by name or by number), with the rest detected automatically, for exactly the reason you state. If it doesn't, please report it as a bug. An alternative to colClasses is the datatable.integer64 global option, which tells fread to load any column it detects as integer64 as character or double instead (also in v1.8.11).
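
A minimal sketch of the colClasses route, overriding a single column and leaving the others to auto-detection. The file and the column names here are hypothetical:

```r
library(data.table)

# Hypothetical sample file: one large-integer ID column plus ordinary columns.
tmp <- tempfile(fileext = ".csv")
writeLines(c("IDFD,value,label",
             "1234567890123456789,1,a",
             "1234567890123456790,2,b"), tmp)

# Override by column name; 'value' and 'label' are still auto-detected.
by_name <- fread(tmp, colClasses = c(IDFD = "character"))

# Override by column number, using the list form of colClasses.
by_number <- fread(tmp, colClasses = list(character = 1))

str(by_name)
str(by_number)
```

Either form names only the column to override, so a table with tens of columns does not need a full colClasses vector.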

Matt Dowle answered Jan 30 '23 21:01