
Error in read.csv with colClasses: scan() expected 'a real' got 'NULL'

Tags: import, r, csv

I am reading in a large CSV file using read.csv. Several websites suggest using colClasses to define the class of each column, which makes the import faster.

classes = c("numeric","integer")

t = read.csv("pca.csv",header=TRUE,colClasses = classes)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
  scan() expected 'a real', got 'NULL'

Some of my data obviously contains nulls, written in the file as the text NULL. Is there a way to use colClasses so that "numeric" or "integer" columns accept nulls? Any other tips on importing large datasets into R faster would also be very helpful. I have all the data in a SQL database and I've tried RODBC, which is, surprisingly, slower than read.csv().

elfty asked Jun 19 '12 20:06


1 Answer

Use na.strings='NULL' in your call to read.csv, so that fields containing the text NULL are read as NA instead of being parsed as numbers.
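
As a minimal sketch of that fix, the snippet below writes a small temporary file and reads it both ways. The file contents, the column names x and n, and the variable names path, failed and dat are made up for illustration; only read.csv, colClasses and na.strings come from the question and answer.

```r
# Write a small sample file in which missing values are spelled "NULL".
# (Hypothetical data; the real pca.csv is not shown in the question.)
path <- tempfile(fileext = ".csv")
writeLines(c("x,n", "1.5,10", "NULL,20", "3.25,NULL"), path)

classes <- c("numeric", "integer")

# Without na.strings, scan() stops on the literal text "NULL".
failed <- inherits(
  try(read.csv(path, header = TRUE, colClasses = classes), silent = TRUE),
  "try-error"
)

# With na.strings = "NULL", those fields are read as NA and the
# columns keep the classes requested in colClasses.
dat <- read.csv(path, header = TRUE, colClasses = classes,
                na.strings = "NULL")
str(dat)
```

na.strings accepts a character vector, so if the file mixes several spellings of a missing value you can list them all, for example na.strings = c("NULL", "NA").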

Matthew Plourde answered Oct 19 '22 22:10