
How to avoid reading dates as IDate in fread()

Tags: r, data.table

The same question was asked here without a solution: "How to avoid fread() importing date info as IDate?"

The old question was not very specific and mixed in another issue. It also did not contain a reprex, which I present here, so I am improving on the question. Please do not mark it as a duplicate :-)

The question is: whenever I read a csv file containing a date column using data.table::fread, it changes the class of the column to IDate. How do I avoid this and keep the column as Date?

library(data.table)
library(magrittr)
dt <- data.table(datecol = seq(Sys.Date(),by = "1 day",length.out = 3))
# let's confirm the format of the column is Date
str(dt)
#> Classes 'data.table' and 'data.frame':   3 obs. of  1 variable:
#>  $ datecol: Date, format: "2022-05-10" "2022-05-11" ...
#>  - attr(*, ".internal.selfref")=<externalptr>
# Now we write it into a file and read back using fwrite and fread
fwrite(dt,"tmpoutput.csv")
fread("tmpoutput.csv") %>% str
#> Classes 'data.table' and 'data.frame':   3 obs. of  1 variable:
#>  $ datecol: IDate, format: "2022-05-10" "2022-05-11" ...
#>  - attr(*, ".internal.selfref")=<externalptr>
# as you see the date format changes to IDate

Created on 2022-05-10 by the reprex package (v2.0.1)

This is not a huge problem, but it needs one extra line of code every time a file is read, i.e. dt[, datecol := as_date(datecol)] (with as_date() from lubridate), so that rbind with a similar DT does not fail.
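For reference, the extra conversion step can be generalised so that it does not depend on knowing the column names. The following is only a minimal sketch: it uses base as.Date() rather than as_date(), builds a small stand-in table with as.IDate() instead of reading a file, and the names idate_cols and value are made up for illustration.

```r
library(data.table)

# Stand-in for a table returned by fread(): one IDate column, one integer column
# (the column name `value` is hypothetical)
dt <- data.table(
  datecol = as.IDate(c("2022-05-10", "2022-05-11", "2022-05-12")),
  value   = 1:3
)

# Names of all columns whose class vector includes "IDate"
idate_cols <- names(dt)[vapply(dt, inherits, logical(1), what = "IDate")]

# Convert those columns to Date by reference; other columns are left untouched
if (length(idate_cols) > 0) {
  dt[, (idate_cols) := lapply(.SD, as.Date), .SDcols = idate_cols]
}

class(dt$datecol)
```

This still has to be remembered after every read, which is exactly the risk described above.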

Is there a simpler way to avoid this? Forgetting the type conversion later is a potential source of bugs.

Lazarus Thurston asked Sep 13 '25 08:09


2 Answers

Building on @Wimpel's answer, you can simply specify the column classes using the colClasses argument:

fread("tmpoutput.csv",colClasses=c(datecol='Date')) %>% str

Classes ‘data.table’ and 'data.frame':  3 obs. of  1 variable:
 $ datecol: Date, format: "2022-05-10" "2022-05-11" "2022-05-12"
 - attr(*, ".internal.selfref")=<externalptr> 
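To tie this back to the rbind concern in the question, here is a small self-contained sketch of the round trip. It assumes a data.table version whose fread() accepts "Date" in colClasses (as the output above suggests); tmp, dt2 and combined are names invented for this example, and fixed dates are used instead of Sys.Date() so the result is reproducible.

```r
library(data.table)

# Write a small table with a Date column to a temporary file
# (tmp is a throwaway path used only in this sketch)
tmp <- tempfile(fileext = ".csv")
dt <- data.table(datecol = seq(as.Date("2022-05-10"), by = "1 day", length.out = 3))
fwrite(dt, tmp)

# Read it back, asking fread() for Date instead of the default IDate
dt2 <- fread(tmp, colClasses = c(datecol = "Date"))
class(dt2$datecol)

# Both tables now carry the same column class, so they can be stacked
combined <- rbind(dt, dt2)
nrow(combined)

unlink(tmp)
```

Because colClasses is given as a named vector, only datecol is affected; any other columns keep the types fread() detects for them.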
Waldi answered Sep 14 '25 23:09


You can create a new class (here: importDate) and refer to it in the colClasses argument of fread. This forces the given columns to be read in as Date (and not the default IDate).

# Register a new class to use as a colClasses target
setClass("importDate")
# Define how a character vector is converted to that class: via as.Date()
setAs("character", "importDate", function(from) as.Date(from))
# Now read; a named vector in colClasses converts only the columns you list
fread("tmpoutput.csv", colClasses = c(datecol = "importDate")) %>% str
# Classes ‘data.table’ and 'data.frame':    3 obs. of  1 variable:
#   $ datecol: Date, format: "2022-05-10" "2022-05-11" "2022-05-12"
# - attr(*, ".internal.selfref")=<externalptr>
Wimpel answered Sep 14 '25 23:09