I have data in a CSV file containing long integers, and I am exchanging this data between CSV files and fst files.
For example:
library(bit64)
library(data.table)
library(fst)
library(magrittr)
# Prepare example csvs
DT64_orig <- data.table(x = c(2345612345679, 1234567890, 8714567890))
fwrite(DT64_orig, "DT64_orig.csv")
# Read and move to fst
DT64 <- fread("DT64_orig.csv")
write.fst(DT64, "DT64_fst.fst")
DT_fst2 <- read.fst("DT64_fst.fst") %>% setDT()
# bit64 integers are not preserved, so this comparison returns FALSE:
identical(DT_fst2, DT64)
Is there a way to use fst files for data.tables containing bit64 integers?
It looks like fst might be dropping column attributes, either when saving or when loading (please file this as an issue with the fst package). In the meantime, you can put the column types back yourself. bit64::integer64 is stored as a plain double under the hood, so no bits have been lost; only the class attribute that tells R how to interpret and print the column is missing.
> DT_fst2
x
1: 1.158886e-311
2: 6.099576e-315
3: 4.305569e-314
> setattr(DT_fst2$x, "class", "integer64")
> DT_fst2
x
1: 2345612345679
2: 1234567890
3: 8714567890
> identical(DT_fst2, DT64)
[1] TRUE
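If several columns are affected, the same fix can be wrapped in a small helper. The following is only a sketch: restore_integer64 is a hypothetical name, not part of fst, bit64, or data.table, and it assumes you already know which columns were integer64 before writing. The round trip is simulated with unclass() so the example runs without fst.

```r
library(bit64)
library(data.table)

# Hypothetical helper (not a library function): re-tag the named columns
# as integer64 by reference, using the same setattr() call as above.
restore_integer64 <- function(DT, cols) {
  for (col in cols) {
    setattr(DT[[col]], "class", "integer64")
  }
  invisible(DT)
}

# Simulate a round trip that drops the class attribute.
# This is a stand-in for the fst behaviour; no fst call is made here.
original <- data.table(
  x = as.integer64(c("2345612345679", "1234567890", "8714567890"))
)
stripped <- data.table(x = unclass(original$x))  # plain doubles, same bits

restore_integer64(stripped, "x")
roundtrip_ok <- identical(stripped$x, original$x)
```

After the call, stripped$x prints as integers again because the underlying 64-bit patterns were never changed.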
Matt is absolutely right: fst currently does not serialize any column attributes. It will in the next version, though, which is due in a few weeks. At that point, classes such as Date and POSIXt will also be supported. Supporting custom attributes will be a challenge, however, because fst provides random access to the data and some attributes are modified upon subsetting (think of time series, for example).
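To illustrate why attributes and subsetting interact badly, here is a small base R sketch using ts. It says nothing about how fst itself handles attributes; it only shows that the "tsp" attribute of a time series cannot simply be copied onto a subset of the data.

```r
# A yearly series of ten observations starting in 2000.
x <- ts(1:10, start = 2000)
tsp_full <- tsp(x)  # start, end, frequency of the full series

# Plain index subsetting returns a bare vector: the "tsp" attribute is dropped.
sub_plain <- x[3:6]
plain_is_ts <- is.ts(sub_plain)

# window() keeps the ts class but rewrites "tsp" to describe the subset.
sub_window <- window(x, start = 2002, end = 2005)
tsp_window <- tsp(sub_window)
```

A file format that reads arbitrary row ranges would have to decide, per attribute, whether to drop it, copy it, or recompute it for the requested rows.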