
Replace NA with 0, only in numeric columns in data.table

I have a data.table with columns of different data types. My goal is to select only the numeric columns and replace the NA values in them with 0. I know that replacing NA values with zero works like this:

DT[is.na(DT)] <- 0

To select only numeric columns, I found this solution, which works fine:

DT[, as.numeric(which(sapply(DT,is.numeric))), with = FALSE]

I can achieve what I want by assigning

DT2 <- DT[, as.numeric(which(sapply(DT,is.numeric))), with = FALSE]

and then do:

DT2[is.na(DT2)] <- 0

But of course I would like to have my original DT modified by reference. With the following, however:

DT[, as.numeric(which(sapply(DT,is.numeric))), with = FALSE]
                 [is.na(DT[, as.numeric(which(sapply(DT,is.numeric))), with = FALSE])]<- 0

I get

"Error in [.data.table([...] i is invalid type (matrix)"

What am I missing? Any help is much appreciated!!

HannesZ asked May 23 '16 12:05



1 Answer

One option is the purrr function map_if() from the tidyverse, combined with ifelse(), which does the job in a single line of code.

library(tidyverse)
library(data.table)  # needed for data.table() and as.data.table()
set.seed(24)
DT <- data.table(v1 = sample(c(1:3, NA), 20, replace = TRUE),
                 v2 = sample(c(LETTERS[1:3], NA), 20, replace = TRUE),
                 v3 = sample(c(1:3, NA), 20, replace = TRUE))

The single line below takes a DT with numeric and non-numeric columns and replaces NA with 0 in the numeric columns only. Note that it returns a new data.table; DT itself is not modified by reference:

DT %>% map_if(is.numeric,~ifelse(is.na(.x),0,.x)) %>% as.data.table

So, tidyverse can be less verbose than data.table sometimes :-)
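
If modifying DT by reference matters, as the question asks, a loop over the numeric columns with data.table's set() is one way to do it. The following is only a minimal sketch, not the one canonical method: num_cols is an illustrative name, and the sample data simply mirrors the DT built above.

```r
library(data.table)

set.seed(24)
DT <- data.table(
  v1 = sample(c(1:3, NA), 20, replace = TRUE),
  v2 = sample(c(LETTERS[1:3], NA), 20, replace = TRUE),
  v3 = sample(c(1:3, NA), 20, replace = TRUE)
)

# Names of the numeric columns (num_cols is an illustrative name)
num_cols <- names(DT)[sapply(DT, is.numeric)]

# set() assigns in place, so DT is updated without being copied;
# only the rows where the column is NA are touched
for (col in num_cols) {
  set(DT, i = which(is.na(DT[[col]])), j = col, value = 0)
}
```

After the loop, v1 and v3 contain no NA values, while the character column v2 is left untouched. Newer data.table releases also ship setnafill(), which may be a shorter route for numeric columns; check the documentation for your installed version.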

Lazarus Thurston answered Oct 25 '22 04:10