
How to replace all NA in a dataframe using tidyr::replace_na? [duplicate]

Tags: r, dplyr, tidyr

I'm trying to fill all NAs in my data with 0's. Does anyone know how to do that using replace_na from tidyr? According to the documentation, we can easily replace NAs in different columns with different values. But how do I replace all of them with a single value? I have many columns...

Using the mtcars dataset as an example:

mtcars[sample(1:nrow(mtcars), 4), sample(1:ncol(mtcars), 4)] <- NA
mtcars %>% replace_na( ??? )
zesla asked Aug 08 '17 19:08


People also ask

How do I replace all NA values in a Dataframe in R?

To replace NA with 0 in an R data frame, use the is.na() function to select all the NA values and assign 0 to them: myDataframe[is.na(myDataframe)] <- 0, where myDataframe is the data frame in which you would like to replace all NAs with 0.

How do I replace NA with dplyr?

You can replace NA values with zero (0) in numeric columns of an R data frame by using is.na(), replace(), imputeTS::na_replace(), dplyr::coalesce(), dplyr::mutate_at(), dplyr::mutate_if(), or tidyr::replace_na().
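As a rough side-by-side of three of the functions listed above, here is a minimal sketch on a made-up numeric vector (the vector x and the result names are illustrative only, not from the original post):

```r
library(dplyr)
library(tidyr)

# Made-up vector for illustration.
x <- c(1, NA, 3)

# Base R: replace the positions flagged by is.na() with 0.
by_replace <- replace(x, is.na(x), 0)

# dplyr: coalesce() takes the first non-missing value at each position.
by_coalesce <- coalesce(x, 0)

# tidyr: replace_na() on a vector takes a single replacement value.
by_tidyr <- replace_na(x, 0)
```

All three should leave the non-missing values untouched and differ mainly in which package they depend on.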

How do I replace NA in R?

The classic way to replace NAs in R is by using the is.na() function. The is.na() function takes a vector or data frame as input and returns a logical object that indicates whether each value is missing (TRUE or FALSE). Next, you can use this logical object to select the missing values and assign them a zero.
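The two steps described above can be sketched as follows, assuming a small made-up data frame (the name df and its values are illustrative only):

```r
# Hypothetical example data; `df` is not from the original question.
df <- data.frame(a = c(1, NA, 3), b = c(NA, 5, 6))

# Step 1: is.na() returns a logical matrix with the same shape as df.
missing <- is.na(df)

# Step 2: use the logical matrix to select the missing cells and assign 0.
df[missing] <- 0
```

Keeping the logical matrix in its own variable is optional; it is done here only to make the intermediate object visible.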

How do I replace NA with blank?

How do you replace NA (missing values) with a blank space or an empty string in an R data frame? You can replace NA values with an empty string in the columns of an R data frame (data.frame) by using the is.na() and replace() functions.
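A minimal sketch of the empty-string variant, assuming a made-up data frame that contains only character columns (the column names are illustrative only):

```r
# Hypothetical example data with character columns only.
df <- data.frame(
  name = c("a", NA, "c"),
  city = c(NA, "x", NA),
  stringsAsFactors = FALSE
)

# replace() swaps the cells flagged by is.na() for an empty string.
df <- replace(df, is.na(df), "")
```

Note that applying this to a data frame with numeric columns would likely turn any numeric column containing NA into a character column, so it is probably best limited to text columns.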


2 Answers

If replace_na is not a mandatory requirement, the following code will work:

mtcars %>% replace(is.na(.), 0) 

Reference Issue: https://stackoverflow.com/a/45574804/8382207
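For readers who are not using the magrittr pipe, the same call can be written without the dot placeholder. This is a sketch only: the copy cars_na and the seed value are assumptions added here so that the built-in mtcars stays untouched and the example is repeatable.

```r
# Recreate the question's setup on a copy of mtcars.
set.seed(1)  # arbitrary seed, only to make the example repeatable
cars_na <- mtcars
cars_na[sample(seq_len(nrow(cars_na)), 4), sample(seq_len(ncol(cars_na)), 4)] <- NA

# Same operation as the piped version, written as a plain function call.
cars_filled <- replace(cars_na, is.na(cars_na), 0)

# Count the remaining missing values.
sum(is.na(cars_filled))
```

The piped form relies on %>% substituting the data frame for the dot inside is.na(.), which is why the explicit form has to name the data frame twice.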

Sagar answered Oct 03 '22 11:10


I found a way to get it working with replace_na as requested (in my microbenchmark testing it was the fastest option):

UPDATE with dplyr v1.0.0

This has been made much easier with the addition of the dplyr::across function:

library(dplyr)
library(tidyr)

mtcars %>%
  mutate(
    across(everything(), ~replace_na(.x, 0))
  )

# Or if you're pipe shy:
mutate(mtcars, across(everything(), ~replace_na(.x, 0)))

That's it! Pretty simple stuff.
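One caveat worth sketching: every column of mtcars is numeric, so everything() is safe there. On a data frame that mixes column types, replacing NA with 0 in a character column may fail or coerce, depending on the tidyr version. A possible variation, assuming a made-up data frame (df and its columns are illustrative only), restricts the replacement to numeric columns with where(is.numeric):

```r
library(dplyr)
library(tidyr)

# Hypothetical mixed-type data; names and values are illustrative only.
df <- data.frame(
  id    = c("a", NA, "c"),
  score = c(1.5, NA, 3),
  stringsAsFactors = FALSE
)

# Only numeric columns get 0; the character column keeps its NA.
filled <- df %>%
  mutate(across(where(is.numeric), ~ replace_na(.x, 0)))
```

A second across() call with where(is.character) and a string replacement could handle the text columns separately if that is wanted.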

For dplyr < v1.0.0

library(tidyr)
library(dplyr)

# First, create a list of all column names and set to 0
myList <- setNames(lapply(vector("list", ncol(mtcars)), function(x) x <- 0), names(mtcars))

# Now use that list in tidyr::replace_na
mtcars %>% replace_na(myList)

To apply this to your working data frame, be sure to replace the 2 instances of mtcars with whatever you have named your working data frame when creating the myList object.
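To avoid editing the data frame name by hand, the list construction could be wrapped in a small function. This is only a sketch: na_fill_list is a hypothetical helper name, not part of tidyr, and cars_na is a made-up copy of mtcars with a few NAs inserted.

```r
library(tidyr)

# Hypothetical helper (not part of tidyr): build a named list holding one
# replacement value per column of `df`.
na_fill_list <- function(df, value = 0) {
  setNames(rep(list(value), ncol(df)), names(df))
}

# Made-up input: a copy of mtcars with a few missing cells.
cars_na <- mtcars
cars_na[1:2, c("mpg", "hp")] <- NA

# Pass the generated list to the data frame method of replace_na().
cars_filled <- replace_na(cars_na, na_fill_list(cars_na))
```

The helper builds the same kind of named list as myList, just with the data frame passed in as an argument.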

Dave Gruenewald answered Oct 03 '22 09:10