After merging a dataframe with another, I'm left with random NAs in the occasional row. I'd like to set these NAs to 0 so I can perform calculations with them.
I'm trying to do this with:
bothbeams.data = within(bothbeams.data, {
    bothbeams.data$x.x = ifelse(is.na(bothbeams.data$x.x) == TRUE, 0, bothbeams.data$x.x)
    bothbeams.data$x.y = ifelse(is.na(bothbeams.data$x.y) == TRUE, 0, bothbeams.data$x.y)
})
Where $x.x is one column and $x.y is the other of course, but this doesn't seem to work.
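For what it's worth, within() evaluates its expression with the data frame's columns visible as ordinary variables, so one likely reason the attempt misbehaves is that assigning to bothbeams.data$x.x inside the block creates a separate object there instead of updating the x.x column. A minimal sketch of the same idea using bare column names; the data frame contents and the `fixed` name are made up for illustration:

```r
# Hypothetical stand-in for the merged data frame; the values are invented.
bothbeams.data <- data.frame(id  = 1:4,
                             x.x = c(1.5, NA, 3.2, NA),
                             x.y = c(NA, 2.0, NA, 4.1))

# Inside within(), columns are referred to by bare name,
# not through bothbeams.data$...
fixed <- within(bothbeams.data, {
  x.x <- ifelse(is.na(x.x), 0, x.x)
  x.y <- ifelse(is.na(x.y), 0, x.y)
})

print(fixed)
```

The original data frame is left untouched; assign the result back to bothbeams.data if you want to overwrite it.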
To replace NA with 0 in an R data frame, use is.na() to select the entries that are NA and assign 0 to them: myDataframe[is.na(myDataframe)] <- 0, where myDataframe is the data frame in which you would like to replace all NAs with 0.
You can replace NA values with zero (0) in the numeric columns of an R data frame by using is.na(), replace(), imputeTS::na_replace(), dplyr::coalesce(), dplyr::mutate_at(), dplyr::mutate_if(), or tidyr::replace_na().
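As a rough illustration of three of the functions listed above, here is a small sketch on made-up data. It assumes the dplyr and tidyr packages are installed; the variable names are hypothetical.

```r
# Hypothetical numeric vector with missing values.
x <- c(1, NA, 3, NA)

via_replace    <- replace(x, is.na(x), 0)   # base R
via_coalesce   <- dplyr::coalesce(x, 0)     # first non-missing value per position
via_replace_na <- tidyr::replace_na(x, 0)   # vector form of replace_na()

# Data-frame form of replace_na(): one replacement value per named column.
df_demo <- data.frame(u = c(1, NA), v = c(NA, 2))
filled  <- tidyr::replace_na(df_demo, list(u = 0, v = 0))

print(filled)
```

All three vector calls should give the same result here; which one to use is mostly a matter of whether you are already working in a dplyr/tidyr pipeline.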
How do you replace NA (missing values) with a blank space or an empty string in an R data frame? You can replace NA values in the columns of a data.frame by using is.na() together with replace().
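A minimal base R sketch of the empty-string case, assuming you only want to touch character columns. The `people` data frame and `chr_cols` are hypothetical names used for illustration:

```r
# Hypothetical data frame with NA in both a character and a numeric column.
people <- data.frame(name  = c("Ann", NA, "Cy"),
                     score = c(90, NA, 75),
                     stringsAsFactors = FALSE)

# Pick out the character columns so numeric NAs are left alone.
chr_cols <- vapply(people, is.character, logical(1))

# Replace NA with "" in each selected column.
people[chr_cols] <- lapply(people[chr_cols],
                           function(col) replace(col, is.na(col), ""))

print(people)
```

Writing "" into a numeric column would turn the whole column into character, which is why the sketch restricts itself to columns that are already character.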
You can just use the output of is.na to replace directly with subsetting:
bothbeams.data[is.na(bothbeams.data)] <- 0
Or with a reproducible example:
dfr <- data.frame(x=c(1:3,NA),y=c(NA,4:6))
dfr[is.na(dfr)] <- 0
dfr
  x y
1 1 0
2 2 4
3 3 5
4 0 6
However, be careful using this method on a data frame containing factors that also have missing values:
> d <- data.frame(x = c(NA,2,3),y = c("a",NA,"c"))
> d[is.na(d)] <- 0
Warning message:
In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
  invalid factor level, NA generated
It "works":
> d
  x    y
1 0    a
2 2 <NA>
3 3    c
...but you will likely want to alter only the numeric columns in this case, rather than the whole data frame. See, e.g., the answer below using dplyr::mutate_if.
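If you'd rather stay in base R, something along these lines should restrict the replacement to numeric columns. This is only a sketch: `num_cols` is a name introduced here, and stringsAsFactors = TRUE is set explicitly because R 4.0.0 and later no longer convert strings to factors by default, so the warning above would not appear there otherwise.

```r
# Same example data as above, with the factor conversion made explicit.
d <- data.frame(x = c(NA, 2, 3), y = c("a", NA, "c"),
                stringsAsFactors = TRUE)

# Logical flag per column: TRUE for numeric columns only.
num_cols <- vapply(d, is.numeric, logical(1))

# Replace NA with 0 in the numeric columns; the factor column is not touched.
d[num_cols] <- lapply(d[num_cols],
                      function(col) replace(col, is.na(col), 0))

print(d)
```

The factor column keeps its NA and no "invalid factor level" warning is raised, since 0 is never assigned to it.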
A solution using mutate_all from dplyr, in case you want to add that to your dplyr pipeline:
library(dplyr)
# funs() has been deprecated since dplyr 0.8.0; a ~ formula does the same job
df %>% mutate_all(~ ifelse(is.na(.), 0, .))
Result:
   A B C
1  0 0 0
2  1 0 0
3  2 0 2
4  3 0 5
5  0 0 2
6  0 0 1
7  1 0 1
8  2 0 5
9  3 0 2
10 0 0 4
11 0 0 3
12 1 0 5
13 2 0 5
14 3 0 0
15 0 0 1
If you only want to replace the NAs in numeric columns, which I assume might be the case in modeling, you can use mutate_if:
library(dplyr)
# funs() has been deprecated since dplyr 0.8.0; a ~ formula does the same job
df %>% mutate_if(is.numeric, ~ ifelse(is.na(.), 0, .))
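or in base R, restricting the replacement to the numeric columns (note that this modifies df in place):
num_cols <- sapply(df, is.numeric)
df[num_cols] <- replace(df[num_cols], is.na(df[num_cols]), 0)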
Result:
   A    B C
1  0    0 0
2  1 <NA> 0
3  2    0 2
4  3 <NA> 5
5  0    0 2
6  0 <NA> 1
7  1    0 1
8  2 <NA> 5
9  3    0 2
10 0 <NA> 4
11 0    0 3
12 1 <NA> 5
13 2    0 5
14 3 <NA> 0
15 0    0 1
With dplyr 1.0.0, across is introduced:
library(dplyr)

# Replace `NA` for all columns
df %>% mutate(across(everything(), ~ ifelse(is.na(.), 0, .)))

# Replace `NA` for numeric columns
df %>% mutate(across(where(is.numeric), ~ ifelse(is.na(.), 0, .)))
Data:
set.seed(123)
df <- data.frame(A=rep(c(0:3, NA), 3),
                 B=rep(c("0", NA), length.out = 15),
                 C=sample(c(0:5, NA), 15, replace = TRUE))