 

data.table::fread an `integer64` type, manually override colClass for only one column

Tags: r, data.table

I have a .csv file in which an ID column contains long integers with leading zeros. fread reads that column as integer64, which drops the leading zeros. How can I specify the class for just that one column and let fread automatically guess the classes of the remaining columns? I'm not sure whether this is an "all-or-nothing" situation.

I have 50+ columns and would rather not have to specify the data types for all of them just because I have to do so for one of them.

My question is related to: R fread - read all columns as character.

user2205916 asked Dec 18 '22 01:12



2 Answers

From ?fread:

# colClasses
data = "A,B,C,D\n1,3,5,7\n2,4,6,8\n"
fread(data, colClasses=c(B="character",C="character",D="character"))  # as read.csv
fread(data, colClasses=list(character=c("B","C","D")))    # saves typing
fread(data, colClasses=list(character=2:4))     # same using column numbers

That is, if your zero-padded column is called big_num, just use colClasses = list(character = 'big_num'). Columns not named in colClasses are still auto-detected, so this is not all-or-nothing.
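
As a minimal sketch of that point: the column name big_num comes from the sentence above, while the inline data and the other two column names are made up for illustration. Only big_num is forced to character; fread still guesses the remaining columns.

```r
library(data.table)

# Hypothetical data: big_num is a zero-padded ID, the other columns are ordinary.
csv <- "big_num,name,score\n0000012345678901234,a,1.5\n0000098765432109876,b,2.5\n"

# Override only big_num; name and score are left to fread's type detection.
dt <- fread(text = csv, colClasses = list(character = "big_num"))

sapply(dt, class)  # inspect the resulting column classes
dt$big_num         # leading zeros are kept because the column is read as text
```

With 50+ columns the call looks the same, since only the columns that need an override are listed.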

MichaelChirico answered Jan 17 '23 17:01



To keep the auto-detection and override only specific columns:

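# Auto-detect the column types (dry run: nrows = 0 returns empty, typed columns)
# class(x)[1L] keeps exactly one class name per column (date columns have several)
colCls <- vapply(fread(fName, nrows = 0), function(x) class(x)[1L], character(1))
# Override the "wrong" detected column types
colCls[c("field1", "field2")] <- "character"
dt <- fread(fName, colClasses = colCls)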
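
For a version that can be run end to end, here is a sketch of the same detect-then-override pattern on a small temporary file. The file contents and the column name value are invented for illustration; field1 and field2 are the placeholder names used above. It uses vapply with class(x)[1L] so that every column yields exactly one class name.

```r
library(data.table)

# Hypothetical file standing in for fName; the contents are made up.
fName <- tempfile(fileext = ".csv")
writeLines(c("field1,field2,value",
             "000123,0000012345678901234,3.14",
             "000456,0000098765432109876,2.72"), fName)

# Dry run: nrows = 0 returns empty columns carrying the detected types.
colCls <- vapply(fread(fName, nrows = 0), function(x) class(x)[1L], character(1))

# Override only the columns that were detected "wrong".
colCls[c("field1", "field2")] <- "character"

dt <- fread(fName, colClasses = colCls)
sapply(dt, class)  # inspect the resulting column classes
```

Compared with passing colClasses = list(character = ...) directly, this reads the file header twice, but it leaves a full named vector of classes that can be inspected or reused for other files with the same layout.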
Yishai Brown answered Jan 17 '23 16:01
