 

Correct syntax for mutate_if

Tags: r, na, dplyr

I would like to replace NA values with zeros via mutate_if in dplyr. The syntax below:

set.seed(1)
mtcars[sample(1:dim(mtcars)[1], 5),
       sample(1:dim(mtcars)[2], 5)] <- NA
require(dplyr)
mtcars %>%
    mutate_if(is.na, 0)
mtcars %>%
    mutate_if(is.na, funs(. = 0))

Returns error:

Error in vapply(tbl, p, logical(1), ...) : values must be length 1, but FUN(X[[1]]) result is length 32

What's the correct syntax for this operation?

Konrad asked Feb 05 '17 12:02




2 Answers

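The "if" in mutate_if refers to choosing columns, not rows. The predicate is called once per column and must return a single TRUE or FALSE. For example, mutate_if(data, is.numeric, ...) means to carry out a transformation on all numeric columns in your dataset. is.na, by contrast, returns one value per element, which is why it fails as a predicate.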
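To see why the predicate matters, here is a small base-R sketch using the built-in mtcars data. It assumes, based on the error message in the question, that mutate_if checks each column with something like vapply(tbl, p, logical(1)); the variable names are only illustrative.

```r
# is.na() returns one value per element, so its result is as long as the column
len_is_na <- length(is.na(mtcars$mpg))            # 32, one per row

# is.numeric() returns a single TRUE/FALSE for the whole column
len_is_numeric <- length(is.numeric(mtcars$mpg))  # 1

# A length-1 predicate gives one logical per column
per_column <- vapply(mtcars, is.numeric, logical(1))

# A per-element predicate fails the length-1 check; capture the message
msg <- tryCatch(
  vapply(mtcars, is.na, logical(1)),
  error = function(e) conditionMessage(e)
)
```

Wrapping the test in any(), as in any(is.na(x)), collapses it to the single value the column selection needs.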

If you want to replace all NAs with zeros in numeric columns:

data %>% mutate_if(is.numeric, funs(ifelse(is.na(.), 0, .))) 
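Note that funs() has been deprecated in later dplyr releases. The sketch below shows the same replacement written with a formula instead, plus the across() form available from dplyr 1.0.0 onward. The data frame df and its columns are hypothetical, made up for illustration.

```r
library(dplyr)

# Hypothetical example data (not from the original post)
df <- data.frame(
  x = c(1, NA, 3),
  y = c(NA, 2.5, 4),
  label = c("a", NA, "c"),
  stringsAsFactors = FALSE
)

# Same idea as above, with a formula in place of funs()
res_if <- df %>%
  mutate_if(is.numeric, ~ ifelse(is.na(.x), 0, .x))

# Equivalent spelling with across() and where(), assuming dplyr >= 1.0.0
res_across <- df %>%
  mutate(across(where(is.numeric), ~ ifelse(is.na(.x), 0, .x)))
```

Only the numeric columns x and y are touched; the NA in the character column label is left alone.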
Hong Ooi answered Oct 06 '22 01:10


I learned this trick from the purrr tutorial, and it also works in dplyr. There are two ways to solve this problem.
First, define custom functions outside the pipe and use them in mutate_if():

any_column_NA <- function(x){
    any(is.na(x))
}
replace_NA_0 <- function(x){
    if_else(is.na(x), 0, x)
}
mtcars %>% mutate_if(any_column_NA, replace_NA_0)

Second, write the functions inline as formulas, using ~ with the placeholder .x (.x can be replaced with ., but not with an arbitrary name):

mtcars %>% mutate_if(~ any(is.na(.x)), ~ if_else(is.na(.x), 0, .x))
# This also works
mtcars %>% mutate_if(~ any(is.na(.)), ~ if_else(is.na(.), 0, .))

In your case, you can also use mutate_all():

mtcars %>% mutate_all(~ if_else(is.na(.x),0,.x)) 

The ~ defines an anonymous function in which .x (or .) stands for its argument. In mutate_if(), that argument is each column in turn.
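As a rough illustration of what the formula shorthand stands for, rlang::as_function() converts a formula into an ordinary function. That dplyr performs this same conversion internally is an assumption here, and the names named, lambda_x, and lambda_dot are made up for the example.

```r
library(rlang)

# An ordinary named function
named <- function(x) any(is.na(x))

# The same predicate written as formulas, using .x and . as the placeholder
lambda_x   <- as_function(~ any(is.na(.x)))
lambda_dot <- as_function(~ any(is.na(.)))

with_na    <- c(1, NA, 3)
without_na <- c(1, 2, 3)

# All three forms agree on both inputs
results_with    <- c(named(with_na), lambda_x(with_na), lambda_dot(with_na))
results_without <- c(named(without_na), lambda_x(without_na), lambda_dot(without_na))
```

The formula versions behave like the named function; they just save defining it separately.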

yusuzech answered Oct 06 '22 01:10