
R: read.csv importing the letter i as NA

Tags: r, csv, na

Pretty simple question (I think). I'm trying to import a .csv file into R from an experiment in which people respond by pressing either the "e" or the "i" key. In testing it, I responded only with the "i" key, so the response variable in the data set is basically a list of "i"s (without the quotation marks). When I try to import the data into R:

noload=read.csv("~/Desktop/eprime check no load.csv", na.strings = "")

the response variable comes out all NAs. When I try it with all "e"s, or a mixture of "e" and "i", it works fine.

What is it about the letter i that makes R treat it as NA? (N.B. it does this even without the na.strings = "" part.)

Thanks in advance for any help.

cmpsych93 asked May 03 '16 20:05



1 Answer

When you ask R to read in a table without specifying data types for the columns, it will try to "guess" the data types. In this case, it guesses "complex" for the data type. For example, if you had datafile.csv with contents

Var
i
i
i

and you do:

df = read.csv("datafile.csv", header = TRUE, na.strings = "")
class(df$Var)

you'll get

[1] "complex"

R treats the column as complex because i denotes the imaginary unit, but a bare i is not a valid complex number, so every value is read as NA. To fix this, specify the data types with colClasses, like so:

df = read.csv("datafile.csv", header = TRUE, na.strings = "", colClasses = "factor")

or replace "factor" with whichever type you need, such as "character". It's usually good practice to specify data types up front like this so you don't run into confusing errors later.
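As a self-contained sketch of the fix, the snippet below writes small CSV files to temporary locations (an assumption made so it runs anywhere, rather than relying on datafile.csv existing) and reads them back with the column type fixed. The second file and its column names Subject and Response are hypothetical, used only to illustrate that colClasses can also be a named vector so that only the affected column is pinned while the rest are still guessed.

```r
# Recreate the one-column example: a header and three rows of "i".
path <- tempfile(fileext = ".csv")
writeLines(c("Var", "i", "i", "i"), path)

# Read every column as text, so no type guessing takes place.
df <- read.csv(path, header = TRUE, na.strings = "", colClasses = "character")
class(df$Var)   # "character"
anyNA(df$Var)   # FALSE

# Hypothetical two-column file: pin only the Response column by name.
path2 <- tempfile(fileext = ".csv")
writeLines(c("Subject,Response", "1,i", "2,i"), path2)
df2 <- read.csv(path2, header = TRUE, na.strings = "",
                colClasses = c(Response = "character"))
class(df2$Response)  # "character"

# Clean up the temporary files.
unlink(c(path, path2))
```

Reading the response column as "character" rather than "factor" keeps the raw key presses as plain strings; they can be converted with factor() afterwards if levels are needed.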

Jason answered Sep 25 '22 04:09