
How to assign column names with fread in R?

Tags: r, data.table

I have the following code:

zz3 <- 'data,key
"VA1,VA2,20140524,,0,0,5969,20140523134902,S7,S1147,140,20140523134902,m/t",4503632376496128
"VA2,VA3,20140711,,0,0,8824,20140601095714,S1,S6402,175,20140601095839,m/t",4503643113914368
"VA1,VA3,20140710,,0,0,11678,20140604085203,S1,S1430,250,20140604085329,m/t",4503666467799040
"VA2,VA1,20140724,,0,0,7109,20140523133835,S7,S793,130,20140523133835,m/t",4503679218483200
"VA3,VA1,20140925,,0,0,10592,20140604092548,S7,S109,395,20140604092714,m/t",4503694653521920'

columnClasses <- c("or"="factor", "d"="factor", "ddate"="factor", "rdate"="factor", "changes"="integer", "class"="factor", "price"="integer", "fdate"="factor", "company"="factor", "number"="factor", "dur"="integer", "added"="factor", "source"="factor", "key"="NULL") # skip last column "key"
data <- fread(zz3, header = FALSE, sep = ",", skip = 1, na.strings = c(""), colClasses = columnClasses)

But it returns an error:

Error in fread(zz3, header = FALSE, sep = ",", skip = 1, na.strings = c(""),  : 
  Column name 'or' in colClasses[[1]] not found

I expected colClasses to assign column names when header = FALSE, but it looks like that is not the case.

How should I fix this? Similar code using read.csv worked well.

LA_ asked Apr 01 '15 15:04



2 Answers

That is indeed not the case: colClasses does not assign column names.

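colClasses lets you define the column types when reading with fread. Any names you give in colClasses must match columns that already exist in the input, which is why fread reports that 'or' was not found. Suppose you have a file delimited by | with a column named 'key' that you want read as character; you would run: fread(filePath, sep='|', colClasses=c(key='character')).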
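As a minimal sketch of naming a column in colClasses, the example below reads a small pipe-delimited string that has a header row. The sample text and the variable names `txt` and `dt` are made up for illustration; only the 'key' column name comes from the answer.

```r
library(data.table)

# Hypothetical pipe-delimited input with a header row (illustrative data only)
txt <- "key|value\n001|10\n002|20\n"

# The name used in colClasses must match a column that exists in the input
dt <- fread(txt, sep = "|", header = TRUE, colClasses = c(key = "character"))

# 'key' is read as character; 'value' keeps whatever type fread detects
str(dt)
```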

If the file has no header row, use setnames to assign column names to the data.table after it has been read.
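A small sketch of that two-step approach, assuming a made-up headerless input with three columns (the sample text and the chosen subset of names are for illustration only):

```r
library(data.table)

# Hypothetical headerless input (illustrative data only)
txt <- "VA1,VA2,5969\nVA2,VA3,8824\n"

# Step 1: read the data; without a header, fread names the columns V1, V2, V3
dt <- fread(txt, header = FALSE, sep = ",")
names(dt)

# Step 2: rename all columns in place, in order
setnames(dt, c("or", "d", "price"))
names(dt)
```

Recent data.table versions also document a col.names argument to fread that assigns names during the read; check ?fread for your installed version before relying on it.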

Colonel Beauvel answered Oct 10 '22 15:10



You should separate the column names from the column classes. fread sets the classes; assigning the column names is a separate step afterwards.

column_names <- c("or", "d", "ddate", "rdate", "changes", "class", "price", "fdate", "company", "number", "dur", "added", "source", "key")
column_classes <- c("factor", "factor", "factor", "factor", "integer", "factor", "integer", "factor", "factor", "factor", "integer", "factor", "factor", "NULL")

data <- fread(zz3, header = FALSE, sep = ",", skip = 1, na.strings = c(""), colClasses = column_classes)
setnames(data, column_names)
Michal answered Oct 10 '22 14:10
