Replace all NA with FALSE in selected columns in R

I have a question similar to this one, but my dataset is a bit bigger: 50 columns, with one column as a UID and the other columns holding either TRUE or NA. I want to change all the NA values to FALSE, but I don't want to use an explicit loop.

Can plyr do the trick? Thanks.

UPDATE #1

Thanks for the quick reply, but what if my dataset looks like the one below?

df <- data.frame(
  id = c(rep(1:19), NA),
  x1 = sample(c(NA, TRUE), 20, replace = TRUE),
  x2 = sample(c(NA, TRUE), 20, replace = TRUE)
)

I only want x1 and x2 to be processed. How can this be done?

lokheart asked Sep 02 '11 03:09



2 Answers

If you want to do the replacement for a subset of variables, you can still use the df[is.na(df)] <- FALSE trick, restricted to those columns, as follows:

df[c("x1", "x2")][is.na(df[c("x1", "x2")])] <- FALSE 

IMO using temporary variables makes the logic easier to follow:

vars.to.replace <- c("x1", "x2")
df2 <- df[vars.to.replace]
df2[is.na(df2)] <- FALSE
df[vars.to.replace] <- df2
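With 50 columns, as in the original question, typing out every column name gets tedious. A minimal sketch of the same base R indexing, assuming the only column to skip is the identifier (named id here, as in the question's example); the flag.cols name and the seed are made up for illustration:

```r
# Hypothetical data shaped like the question's: an id column plus TRUE/NA flags.
set.seed(1)  # assumed seed, only so the sample is reproducible
df <- data.frame(
  id = c(1:19, NA),
  x1 = sample(c(NA, TRUE), 20, replace = TRUE),
  x2 = sample(c(NA, TRUE), 20, replace = TRUE)
)

# Every column except the identifier.
flag.cols <- setdiff(names(df), "id")

# Same logical-matrix indexing as above, applied to the derived column set.
df[flag.cols][is.na(df[flag.cols])] <- FALSE

# Check: no NA left in the flag columns, while the NA in id is untouched.
anyNA(df[flag.cols])
is.na(df$id[20])
```

Selecting columns by exclusion keeps the call the same length no matter how many flag columns there are, but it will also touch any other non-id column, so it only fits when every remaining column is a TRUE/NA flag.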
Hong Ooi answered Sep 21 '22 21:09


tidyr::replace_na is an excellent function for this.

library(tidyr)  # provides replace_na() and re-exports %>%

df %>%
  replace_na(list(x1 = FALSE, x2 = FALSE))

This is such a great quick fix. The only trick is that you have to make a list of the columns you want to change.
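For a wide data frame, that list does not have to be typed by hand. A minimal sketch, assuming tidyr is installed and that every column other than id should be filled; the cols and fill names and the sample data are made up for illustration:

```r
library(tidyr)  # provides replace_na()

# Hypothetical data shaped like the question's.
df <- data.frame(
  id = c(1:19, NA),
  x1 = rep(c(NA, TRUE), times = 10),
  x2 = rep(c(TRUE, NA), times = 10)
)

# Columns to fill: everything except the identifier.
cols <- setdiff(names(df), "id")

# Build the named list programmatically; here it equals list(x1 = FALSE, x2 = FALSE).
fill <- setNames(rep(list(FALSE), length(cols)), cols)

# replace_na() only touches the columns named in the list, so id keeps its NA.
df <- replace_na(df, fill)
```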

mtelesha answered Sep 22 '22 21:09