I am reading in a large CSV file using read.csv. Several websites suggest using colClasses to define the class of each column, which makes the import faster.
classes = c("numeric","integer")
t = read.csv("pca.csv",header=TRUE,colClasses = classes)
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
scan() expected 'a real', got 'NULL'
I obviously have NULLs in some of my data. Is there a way to use colClasses so that "numeric" or "integer" columns can include NULLs? Also, any other tips on importing large datasets into R faster would be very helpful. I have all the data in a SQL database, and I've tried using RODBC, which is surprisingly slower than read.csv().
Use na.strings='NULL' in your call to read.csv. This tells R to read the literal string NULL as NA, which "numeric" and "integer" columns can hold.
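A minimal sketch of that fix, assuming a two-column file like the one in the question. The sample rows, the column names score and count, and the temporary file are made up for illustration and stand in for pca.csv:

```r
# Hypothetical sample data standing in for pca.csv: the literal string NULL
# marks missing values, as a SQL export often does.
path <- tempfile(fileext = ".csv")
writeLines(c("score,count",
             "1.5,10",
             "NULL,20",
             "3.25,NULL"), path)

classes <- c("numeric", "integer")

# na.strings = "NULL" makes read.csv treat the string NULL as NA, so the
# typed columns parse instead of failing inside scan().
dat <- read.csv(path, header = TRUE, colClasses = classes,
                na.strings = "NULL")

str(dat)
```

If the file might also contain the string NA for missing values, passing na.strings = c("NA", "NULL") should cover both markers.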