
How to replace NA values in a table for selected columns

There are a lot of posts about replacing NA values. I am aware that one can replace every NA in a table/data frame like this:

x[is.na(x)]<-0 

But what if I want to restrict it to only certain columns? Let me show you an example.

First, let's start with a dataset.

set.seed(1234)
x <- data.frame(a=sample(c(1,2,NA), 10, replace=T),
                b=sample(c(1,2,NA), 10, replace=T),
                c=sample(c(1:5,NA), 10, replace=T))

Which gives:

    a  b  c
1   1 NA  2
2   2  2  2
3   2  1  1
4   2 NA  1
5  NA  1  2
6   2 NA  5
7   1  1  4
8   1  1 NA
9   2  1  5
10  2  1  1

OK, so I want to restrict the replacement to columns 'a' and 'b'. My attempts were:

x[is.na(x), 1:2]<-0 

and:

x[is.na(x[1:2])]<-0 

Neither of these works.

My data.table attempt, where y<-data.table(x), was obviously never going to work:

y[is.na(y[,list(a,b)]), ] 

I want to pass columns inside the is.na argument but that obviously wouldn't work.

I would like to do this in a data.frame and a data.table. My end goal is to recode the 1:2 to 0:1 in 'a' and 'b' while keeping 'c' the way it is, since it is not a logical variable. I have a bunch of columns so I don't want to do it one by one. And, I'd just like to know how to do this.

Do you have any suggestions?

jnam27 asked Oct 15 '13 10:10



1 Answer

You can do:

x[, 1:2][is.na(x[, 1:2])] <- 0 

or better (IMHO), use the variable names:

x[c("a", "b")][is.na(x[c("a", "b")])] <- 0 

In both cases, 1:2 or c("a", "b") can be replaced by a pre-defined vector.
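
The reason this works is that the logical mask from is.na() has the same shape as the sub-frame being assigned to, whereas the attempts in the question pair a mask of one shape with a target of another. A minimal sketch with a pre-defined vector follows. The names `cols`, `x_df` and `y` are made up for illustration. The data.table part is not from the answer above: it is one possible approach using data.table's `set()`, and it only runs if the data.table package is installed. The sampled values depend on the R version, so they may differ from the table printed in the question.

```r
# Rebuild the example data from the question.
set.seed(1234)
x <- data.frame(a = sample(c(1, 2, NA), 10, replace = TRUE),
                b = sample(c(1, 2, NA), 10, replace = TRUE),
                c = sample(c(1:5, NA), 10, replace = TRUE))

# Hypothetical name for the pre-defined vector of columns to clean.
cols <- c("a", "b")

# data.frame: the mask is.na(x_df[cols]) has the same shape as x_df[cols],
# so only NAs in the selected columns are replaced.
x_df <- x
x_df[cols][is.na(x_df[cols])] <- 0

# data.table (assumption: package installed): set() updates one column
# at a time, by reference, touching only the rows where that column is NA.
if (requireNamespace("data.table", quietly = TRUE)) {
  y <- data.table::as.data.table(x)
  for (j in cols) {
    data.table::set(y, i = which(is.na(y[[j]])), j = j, value = 0)
  }
}
```

In both variants column 'c' is left untouched, including any NA it contains, and adding more columns only means extending `cols`.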

flodel answered Sep 20 '22 03:09