na.strings applied to a dataframe

Tags:

r

I currently have a dataframe in which there are several values I would like converted to NA. When I first imported this dataframe from a .csv, I could use na.strings=c("A", "B", "C") and so on to turn the values I didn't want into NA.

I want to do the same thing again, but this time on a dataframe I already have, not while importing another .csv.

To import the data, I used:

data<-read.csv("code.csv", header=T, strip.white=TRUE, stringsAsFactors=FALSE, na.strings=c("", "A", "B", "C"))

Now, with "data", I would like to subset it while removing even more specific values in the rows. I tried something like:

data2<-data.frame(data, na.strings=c("D", "E", "F"))

Of course this doesn't work, because I think na.strings is only an argument of the read.* functions (read.csv, read.table), not of other functions such as data.frame(). Is there an equivalent way to convert certain values into NA so that I can then call na.omit(data2) fairly easily?

Thanks for your help.

Alex Petralia asked Jan 29 '14 04:01


2 Answers

Here's a way to replace values in multiple columns:

# an example data frame
dat <- data.frame(x = c("D", "E", "F", "G"), 
                  y = c("A", "B", "C", "D"), 
                  z = c("X", "Y", "Z", "A"))
#   x y z
# 1 D A X
# 2 E B Y
# 3 F C Z
# 4 G D A

# values to replace
na.strings <- c("D", "E", "F")

# index matrix 
idx <- Reduce("|", lapply(na.strings, "==", dat))

# replace values with NA
is.na(dat) <- idx

dat
#      x    y z
# 1 <NA>    A X
# 2 <NA>    B Y
# 3 <NA>    C Z
# 4    G <NA> A
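
Since the question's end goal was na.omit(), here is a minimal sketch of the whole flow. The data frame below is a different, made-up example (not the asker's data), chosen so that one row survives; stringsAsFactors = FALSE is set explicitly on the assumption that the code may run on R versions before 4.0.

```r
# hypothetical example data, not from the question
dat <- data.frame(x = c("D", "G", "H", "G"),
                  y = c("A", "B", "E", "C"),
                  z = c("X", "Y", "Z", "F"),
                  stringsAsFactors = FALSE)

na.strings <- c("D", "E", "F")

# one logical matrix per string: TRUE where a cell equals that string
per_string <- lapply(na.strings, "==", dat)

# element-wise OR of those matrices gives the combined index matrix
idx <- Reduce("|", per_string)

# set the matching cells to NA
is.na(dat) <- idx

# drop every row that now contains at least one NA
dat2 <- na.omit(dat)
```

Here rows 1, 3 and 4 each contain one of the listed strings, so only row 2 should remain in dat2.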
Sven Hohenstein answered Oct 14 '22 05:10


Just assign the NA values directly.

e.g.:

x <- data.frame(a=1:5, b=letters[1:5])
# > x
#   a b
# 1 1 a
# 2 2 b
# 3 3 c
# 4 4 d
# 5 5 e

# convert the 'b' and 'd' in column b to NA
x$b[x$b %in% c('b', 'd')] <- NA
# > x
#   a    b
# 1 1    a
# 2 2 <NA>
# 3 3    c
# 4 4 <NA>
# 5 5    e
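
If the unwanted values can appear in any column, the same %in% idea can be applied to each column in turn. This is only a sketch: the extra column c and the name bad are made up for illustration, and stringsAsFactors = FALSE is an assumption so that the columns are plain character vectors.

```r
# hypothetical data frame with the unwanted values spread over two columns
x <- data.frame(a = 1:5,
                b = letters[1:5],
                c = c("d", "x", "y", "b", "z"),
                stringsAsFactors = FALSE)

bad <- c("b", "d")

# x[] <- ... replaces the columns but keeps x a data frame
x[] <- lapply(x, function(col) {
  col[col %in% bad] <- NA
  col
})

# rows without any NA
x2 <- na.omit(x)
```

The integer column a is left unchanged because none of its values match the strings in bad.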
mathematical.coffee answered Oct 14 '22 03:10