How can I get fread() to set "" to a NA for all variables including character variables?
I am importing a .csv file where missing values are empty strings (""; no space). I want "" to be interpreted as the missing value NA, and tried `na.strings = ""` without success:
data <- fread("file.csv", na.strings = "")
unique(data$character_variable)
# [1] "abc" "def" ""
On the other hand, when I use read.csv with na.strings = "", the "" are turned into NAs, even for character variables. This is the result I want.
data <- read.csv("file.csv", na.strings = "")
unique(data$character_variable)
# [1] "abc" "def" NA
Well, you can't if your csv file looks like this:
a,b
x,y
"",1
Note that whatever is inside the double quotes is treated as a string literal, because " is the quote character. In that sense, ,"", in a csv file means an empty string, not a missing value (which would be written ,,). I consider this a good feature for consistency. It is also described in the na.strings section of the fread documentation:
A character vector of strings which are to be interpreted as NA values. By default, ",," for columns of all types, including type character is read as NA for consistency. ,"", is unambiguous and read as an empty string. To read ,NA, as NA, set na.strings="NA". To read ,, as blank string "", set na.strings=NULL. When they occur in the file, the strings in na.strings should not appear quoted since that is how the string literal ,"NA", is distinguished from ,NA,, for example, when na.strings="NA".
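The difference between a quoted and a bare empty field can be checked with a small inline example. This is a minimal sketch, assuming a data.table version that supports the `text` argument of fread; the two inline CSV strings are made up for illustration, and `na.strings = ""` is passed explicitly, as in the question, rather than relying on the default.

```r
library(data.table)

# Quoted empty field: ,"", is an explicit empty-string literal,
# so it is not matched against na.strings.
quoted <- fread(text = 'a,b\nx,y\n"",1\n', sep = ",", header = TRUE,
                na.strings = "")

# Bare empty field: ,, is unquoted, so it can be matched by na.strings = "".
bare <- fread(text = 'a,b\nx,y\n,1\n', sep = ",", header = TRUE,
              na.strings = "")

print(quoted$a)
print(bare$a)
```

Comparing the two printed vectors shows whether the quoting in the file, rather than the `na.strings` setting, is what keeps the empty string from becoming NA.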
On the other hand, you may notice a different result if the file looks like this:
a,b
1,y
"",1
In this case the empty string is converted into NA. However, I don't think this is a bug, because the behaviour is probably a consequence of type coercion by the parser. The Details section of the same documentation says:
The lowest type for each column is chosen from the ordered list: logical, integer, integer64, double, character.
So column a is first read as a character column and later converted into an integer one. The empty string is still read as is but coerced into an NA_integer_ in the second step.
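If the file itself cannot be changed, one possible workaround is to replace the remaining empty strings after the import. The following is only a sketch and not something the fread documentation prescribes; the inline CSV and the variable names `dt` and `chr_cols` are illustrative.

```r
library(data.table)

# Inline stand-in for file.csv, with a quoted empty string in column a.
dt <- fread(text = 'a,b\nx,y\n"",1\n', sep = ",", header = TRUE)

# Find the character columns, then overwrite "" with NA by reference.
chr_cols <- names(dt)[vapply(dt, is.character, logical(1))]
for (col in chr_cols) {
  # which() drops existing NAs, so only true empty strings are selected.
  set(dt, i = which(dt[[col]] == ""), j = col, value = NA_character_)
}

print(dt)
```

Using `set()` inside the loop modifies the table in place and avoids copying each column, which matters mainly for large files.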