
replace NA with 0 using starts_with()

Tags: r, na, tidyverse

I am trying to replace NA values in a specific set of columns in my tibble. The columns all start with the same prefix, so I would like to know whether there is a concise way to use the starts_with() function from the dplyr package to do this.

I have seen several other questions on SO, but they all require specific column names or positions. I'm really trying to be lazy: I don't want to list ALL the columns, just the prefix.

I've tried the replace_na() function from the tidyr package to no avail. I know the code I have is wrong for the assignment, but my vocabulary isn't large enough to know where to look.

Reprex:

library(tidyverse)

tbl1 <- tibble(
 id = c(1, 2, 3),
 num_a = c(1, NA, 4),
 num_b = c(NA, 99, 100),
 col_c = c("d", "e", NA)
)

replace_na(tbl1, list(starts_with("num_") = 0)))

Dan asked Jun 23 '17 20:06



1 Answer

How about using mutate_at with if_else (or case_when)? This works if you want to replace all NA in the columns of interest with 0.

mutate_at(tbl1, vars( starts_with("num_") ), 
          funs( if_else( is.na(.), 0, .) ) )

# A tibble: 3 x 4
     id num_a num_b col_c
  <dbl> <dbl> <dbl> <chr>
1     1     1     0     d
2     2     0    99     e
3     3     4   100  <NA>

Note that starts_with() and the other select helpers return an integer vector giving the positions of the matched variables. I always have to keep this in mind when I try to use them outside the situations where I normally use them.
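To see the positions a select helper resolves to, one option is to evaluate it against the tibble directly. The sketch below assumes a recent tidyselect, where eval_select() is the exported entry point for this; the name pos is only illustrative, and tbl1 is rebuilt from the question so the snippet runs on its own.

```r
library(tibble)
library(tidyselect)
library(rlang)

# Same data as in the question
tbl1 <- tibble(
  id = c(1, 2, 3),
  num_a = c(1, NA, 4),
  num_b = c(NA, 99, 100),
  col_c = c("d", "e", NA)
)

# Evaluate the helper in a selection context; the result is a
# named integer vector of column positions
pos <- eval_select(expr(starts_with("num_")), tbl1)
pos
# num_a num_b
#     2     3
```

This is why the helper cannot be used as a name inside list(): it yields column positions, not something replace_na() can match against.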

In newer versions of dplyr (0.8.0 and later, where funs() is deprecated), use list() with a tilde formula instead of funs():

list( ~if_else( is.na(.), 0, .) )
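
In dplyr 1.0.0 and later, mutate_at() and the other scoped verbs are superseded by across(). A minimal sketch of the same replacement in that style, assuming dplyr >= 1.0.0 and using tidyr::replace_na() on each selected column (the name result is only illustrative):

```r
library(tibble)
library(dplyr)
library(tidyr)

# Same data as in the question
tbl1 <- tibble(
  id = c(1, 2, 3),
  num_a = c(1, NA, 4),
  num_b = c(NA, 99, 100),
  col_c = c("d", "e", NA)
)

# Apply replace_na() to every column whose name starts with "num_";
# other columns (id, col_c) are left untouched
result <- mutate(tbl1, across(starts_with("num_"), ~ replace_na(.x, 0)))
result
```

Here col_c keeps its NA because it does not match the prefix.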

aosmith answered Sep 24 '22 22:09