
What's the best way to replace missing values with NA when reading in a .csv?

Tags: r, csv, na

I have a .csv dataset with many missing values, and I'd like R to recognize them all the same way (the "correct" way) when I read the table in. I've been using:

import <- read.csv("/Users/dataset.csv",
                   header = TRUE, na.strings = c(""))

This script fills all the empty cells with something, but it's not consistent. When I look at the data with head(import), some missing cells are shown as <NA> and some as NA. I fear that R will treat these two ways of marking missing values differently when I start analyzing the dataset, so I'd like the import to read in all missing values uniformly.

Finally, some of the missing values in my csv file are represented with a period only. I would also like those periods to be represented by the correct missing value notation when I import to R.

Luke asked Dec 11 '12 15:12




1 Answer

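The <NA> vs. NA difference just means that some of your columns are character (or factor) and some are numeric, that's all. When a data frame is printed, a missing value in a character or factor column is shown as <NA>, and one in a numeric column is shown as NA. Both are the same missing value, so absolutely nothing is wrong with that.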
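To see that the two displays refer to the same kind of missing value, here is a minimal sketch using a made-up data frame. The names `df`, `id`, `grp`, and `na_counts` are illustrative only and do not come from the original dataset.

```r
# Hypothetical data: one numeric and one character column,
# each containing a single missing value.
df <- data.frame(
  id  = c(1, NA, 3),
  grp = c("a", NA, "c"),
  stringsAsFactors = FALSE
)

# Printing shows the numeric NA as NA and the character NA as <NA>.
print(df)

# is.na() detects both in exactly the same way.
na_counts <- colSums(is.na(df))
print(na_counts)
```

Functions such as is.na(), complete.cases(), and na.omit() do not distinguish between the two displays.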

As Ben mentioned above, if some of your missing values in the csv are represented by a single period, ., then you can specify a vector of values that should be treated as NAs via:

na.strings=c("",".","NA")

as an argument to read.csv.
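Put together, a call along these lines should work. This is only a sketch: the CSV contents, column names, and the `csv_text` and `missing_per_column` variables are invented for illustration, and `text =` stands in for the real file path so the example is self-contained.

```r
# Hypothetical CSV contents: empty cells, single periods, and literal "NA"
# are all used to mark missing values.
csv_text <- "id,score,group
1,4.5,a
2,.,b
3,,
4,NA,."

# Treat "", ".", and "NA" as missing when reading the data in.
import <- read.csv(
  text = csv_text,            # replace with the file path for a real file
  header = TRUE,
  na.strings = c("", ".", "NA"),
  stringsAsFactors = FALSE
)

# score stays numeric because the periods are converted to NA
# before the column type is determined.
str(import)

# Count missing values per column.
missing_per_column <- colSums(is.na(import))
print(missing_per_column)
```

Without "." in na.strings, the score column would most likely be read as character (or factor) rather than numeric, because the lone periods cannot be parsed as numbers.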

joran answered Oct 22 '22 10:10