R: read.csv importing the letter i as NA





Pretty simple question (I think). I'm trying to import a .csv file into R, from an experiment in which people respond by pushing either the "e" or the "i" key. In testing it, I responded only with the "i" key, so the response variable in the data set is basically a list of "i"s (without the quotation marks). When I try to import the data into R:

noload=read.csv("~/Desktop/eprime check no load.csv", na.strings = "")

the response variable comes out all NAs. When I try it with all "e"s, or a mixture of "e" and "i", it works fine.

What is it about the letter i that makes R treat it as NA (n.b. it does this even without the na.strings = "" part)?

Thanks in advance for any help.

1 Answer

When you ask R to read in a table without specifying data types for the columns, it will try to "guess" the data types. In this case, it guesses "complex" for the data type. For example, if datafile.csv has a column containing only the letter i and you do:

df = read.csv("datafile.csv", header = TRUE, na.strings = "")

and then check the class of the imported column with class(), you'll get

[1] "complex"
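
If you want to reproduce this check yourself, here is a minimal sketch. The temporary file and the column name response are made up for illustration (they are not from the original data), and the class that gets reported may depend on your version of R, so no expected output is shown:

```r
# Hypothetical stand-in for datafile.csv: one column holding only the letter i.
path <- tempfile(fileext = ".csv")
writeLines(c("response", "i", "i", "i"), path)

# Let read.csv guess the column types, as in the call above.
df <- read.csv(path, header = TRUE, na.strings = "")

# Report the class R chose for each column, and whether any values became NA.
print(sapply(df, class))
print(anyNA(df$response))
```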

R interprets the i as the imaginary unit, so it types the whole column as complex. To fix this, simply specify the data types with colClasses, like so:

df = read.csv("datafile.csv", header = TRUE, na.strings = "", colClasses = "factor")

or replace factor with whatever class you want (for example "character"). It's usually good practice to specify data types up front like this so you don't run into confusing errors later.
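
As a rough sketch of that advice, assuming a file with a trial column and a response column (both names, and the temporary file, are invented here for illustration), you could declare one class per column and then turn the responses into a factor with the two keys as levels:

```r
# Hypothetical stand-in for the experiment file.
path <- tempfile(fileext = ".csv")
writeLines(c("trial,response", "1,i", "2,i", "3,i"), path)

# One class per column, in file order, so read.csv does not have to guess.
noload <- read.csv(path, na.strings = "",
                   colClasses = c("integer", "character"))

# Optional: make the response a factor with both possible keys as levels,
# so "e" is still a known level even when nobody pressed it.
noload$response <- factor(noload$response, levels = c("e", "i"))

print(table(noload$response))
```

Reading the column as character first and converting afterwards keeps the two steps separate, which makes it easier to see where a value went wrong if one ever does.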

