Windows 8.1, R version 3.1.1 (2014-07-10), System x86_64, mingw32
I've got a file with a lot of observations. Here are some lines from the file:
Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
28/4/2007;00:20:00;0.492;0.208;236.240;2.200;0.000;0.000;0.000
28/4/2007;00:21:00;?;?;?;?;?;?;
21/12/2006;11:25:00;0.246;0.000;241.740;1.000;0.000;0.000;0.000
21/12/2006;11:26:00;0.246;0.000;241.830;1.000;0.000;0.000;0.000
Missing values are represented by "?". I'm trying to read the file with:
epcData <- fread(dataFile,
                 sep = ";",
                 header = TRUE,
                 na.strings = "?",
                 colClasses = c("character", "character", rep("numeric", 7)),
                 stringsAsFactors = FALSE)
I've got warnings like:
Bumped column 3 to type character on data row 10, field contains '?'. Coercing previously read values in this column from integer or numeric back to character which may not be lossless; e.g., if '00' and '000' occurred before they will now be just '0', and there may be inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in this column before the bump). If this matters please rerun and set 'colClasses' to 'character' for this column. Please note that column type detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so hopefully this message should be very rare. If reporting to datatable-help, please rerun and include the output from verbose=TRUE.
The row 10 is
28/4/2007;00:21:00;?;?;?;?;?;?;
epcData[10]
prints
        Date     Time Global_active_power Global_reactive_power Voltage
1: 28/4/2007 00:21:00                  NA                    NA      NA
   Global_intensity Sub_metering_1 Sub_metering_2 Sub_metering_3
1:               NA             NA             NA             NA
But every column has mode "character", including columns 3:9, even though I passed colClasses = c("character", "character", rep("numeric", 7)).
What is going wrong?
As of today, with version 1.12.2 of the data.table package, this is no longer an issue: the import of the above CSV data works flawlessly, and all the question marks are replaced by NAs.
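To check this on your own installation, here is a minimal sketch that reads a few of the rows from the question through fread's `text` argument and inspects the resulting column classes. The names `sampleText`, `rawData` and `numCols` are made up for this illustration. The second half is a possible fallback for older versions, following the hint in the warning message (read the columns as character, then convert them); it is one way to do it, not necessarily the recommended one.

```r
library(data.table)

# A few rows from the question, including the row whose fields are "?".
sampleText <- "Date;Time;Global_active_power;Global_reactive_power;Voltage;Global_intensity;Sub_metering_1;Sub_metering_2;Sub_metering_3
16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
28/4/2007;00:20:00;0.492;0.208;236.240;2.200;0.000;0.000;0.000
28/4/2007;00:21:00;?;?;?;?;?;?;
21/12/2006;11:25:00;0.246;0.000;241.740;1.000;0.000;0.000;0.000"

# Same call as in the question, but reading from the in-memory sample.
epcData <- fread(text = sampleText,
                 sep = ";",
                 header = TRUE,
                 na.strings = "?",
                 colClasses = c("character", "character", rep("numeric", 7)))

# Inspect the column classes; with a recent data.table,
# columns 3:9 are expected to be numeric.
print(sapply(epcData, class))

# Fallback sketch for older versions (an assumption, not taken from the
# data.table docs): read everything as character, then convert by hand.
rawData <- fread(text = sampleText,
                 sep = ";",
                 header = TRUE,
                 colClasses = "character")
numCols <- names(rawData)[3:9]
rawData[, (numCols) := lapply(.SD, function(x) {
  x[x %in% "?"] <- NA   # map the "?" marker to NA before converting
  as.numeric(x)         # empty strings also become NA
}), .SDcols = numCols]

print(sapply(rawData, class))
```

If the first `print` still reports "character" for columns 3:9, the installed data.table is probably older than the version mentioned above, and the fallback conversion can be used instead.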