Using read_excel(na = ) how do you specify more than one NA character string?

I'm trying to read an Excel sheet into R that uses multiple values for NA (specifically, "N/A" and "n/a"). If I give na = a vector of strings, it throws an error:

read_excel(path = "file.xlsx",
           na = "N/A") #This works just fine

read_excel(path = "file.xlsx",
           na = c("N/A", "n/a"))

Error in eval(substitute(expr), envir, enclos) : expecting a single value

Any ideas on how to read this in with both strings converted to NA? Or am I better off doing a find/replace once the data is in R?

Watanake asked Mar 06 '17 20:03

1 Answer

As you gathered, the released version of read_excel accepts only a single string for na. Consider using gdata::read.xls instead, whose na.strings argument takes a character vector:

gdata::read.xls("file.xlsx", na.strings = c("N/A", "n/a"))

Edit: Note that read.xls requires Perl to be installed. On Windows you may need to pass its location explicitly, for example perl = "C:/Perl/bin/perl.exe", in the call to read.xls.

Edit 2: As @r2evans suggested in the comments, the development version of readxl supports multiple na values:

devtools::install_github("tidyverse/readxl")
readxl::read_excel(path = "file.xlsx", na = c("N/A", "n/a"))
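
If neither option is available, the find/replace route mentioned in the question also works after the data is loaded. The following is only a minimal base R sketch: the data frame df is a hypothetical stand-in for what read_excel would return, and its column names and values are made up for illustration.

```r
# Hypothetical stand-in for the data frame returned by read_excel();
# the column names and values are invented for this example.
df <- data.frame(
  id    = c(1, 2, 3),
  score = c("10", "N/A", "n/a"),
  group = c("a", "n/a", "b"),
  stringsAsFactors = FALSE
)

# Strings that should be treated as missing.
na_strings <- c("N/A", "n/a")

# Replace matching cells with NA in every character column.
# %in% checks all markers at once and returns FALSE (not NA)
# for cells that are already missing.
df[] <- lapply(df, function(col) {
  if (is.character(col)) col[col %in% na_strings] <- NA
  col
})

# A column that was read as text only because of the NA markers
# can then be converted back to numeric.
df$score <- as.numeric(df$score)
```

The drawback compared with handling this at read time is the extra type conversion: any numeric column containing one of the markers arrives as character and has to be converted by hand.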
Johan Larsson answered Sep 23 '22 10:09