Replace NA with 0 in a data frame column [duplicate]

Tags: dataframe, r, na

Possible Duplicate:
Set NA to 0 in R

I have a data.frame with a column containing NA values. I want to replace NA with 0 or any other value. I have tried the methods suggested in many threads, but none of them gave me the result I wanted. Here is what I tried:

a$x[a$x == NA] <- 0
a[ , c("x")] <- apply(a[ , c("x")], 1, function(z){replace(z, is.na(z), 0)})
a$x[is.na(a$x), ] <- 0

None of the above methods replaced NA with 0 in column x for data.frame a. Why?

asked Nov 01 '12 07:11 by Kunal Batra



2 Answers

Since nobody so far felt fit to point out why what you're trying doesn't work:

  1. NA == NA doesn't return TRUE, it returns NA (since comparing to undefined values should yield an undefined result).
  2. You're trying to call apply on an atomic vector. You can't use apply to loop over the elements in a column.
  3. Your subscripts are off - you're trying to give two indices into a$x, which is just the column (an atomic vector).

I'd fix up attempt 3 by dropping the extra comma, which gives a$x[is.na(a$x)] <- 0
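The three points above can be checked in a short session. This is only a sketch: the data frame `a` below is a made-up stand-in for the asker's data, which the question does not show.

```r
# Hypothetical stand-in for the asker's data frame `a`
a <- data.frame(x = c(1, NA, 3, NA), y = c("p", "q", "r", "s"))

# 1. Comparing with == never yields TRUE for NA; the result is NA itself
na_cmp <- NA == NA          # NA
col_cmp <- a$x == NA        # NA for every element

# is.na() is the test that returns TRUE/FALSE
missing <- is.na(a$x)       # FALSE TRUE FALSE TRUE

# 2. Selecting a single column this way drops to a plain vector with no dim,
#    so apply() has no rows or columns to loop over and signals an error
col_dim <- dim(a[, c("x")]) # NULL
apply_res <- try(apply(a[, c("x")], 1, identity), silent = TRUE)

# 3. a$x is a vector, so it takes exactly one index
a$x[is.na(a$x)] <- 0
```

After the last line, `a$x` holds 1, 0, 3, 0 and column `y` is untouched.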

answered Sep 22 '22 19:09 by themel


First, here's some sample data:

set.seed(1)
dat <- data.frame(one = rnorm(15),
                  two = sample(LETTERS, 15),
                  three = rnorm(15),
                  four = runif(15))
dat <- data.frame(lapply(dat, function(x) { x[sample(15, 5)] <- NA; x }))
head(dat)
#          one  two       three      four
# 1         NA    M  0.80418951 0.8921983
# 2  0.1836433    O -0.05710677        NA
# 3 -0.8356286    L  0.50360797 0.3899895
# 4         NA    E          NA        NA
# 5  0.3295078    S          NA 0.9606180
# 6 -0.8204684 <NA> -1.28459935 0.4346595

Here's our replacement:

dat[["four"]][is.na(dat[["four"]])] <- 0
head(dat)
#          one  two       three      four
# 1         NA    M  0.80418951 0.8921983
# 2  0.1836433    O -0.05710677 0.0000000
# 3 -0.8356286    L  0.50360797 0.3899895
# 4         NA    E          NA 0.0000000
# 5  0.3295078    S          NA 0.9606180
# 6 -0.8204684 <NA> -1.28459935 0.4346595

Alternatively, you can, of course, write dat$four[is.na(dat$four)] <- 0
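Since the question also asks about replacing NA with "any other value", the same indexing idiom can be wrapped in a small function. This is a minimal sketch: `replace_na_col` is a hypothetical helper name, not part of base R or any package, and the tiny data frame is invented for illustration.

```r
# Hypothetical helper: replace NA in one named column of a data frame.
# Uses the same df[[col]][is.na(df[[col]])] <- value idiom as above.
replace_na_col <- function(df, col, value = 0) {
  df[[col]][is.na(df[[col]])] <- value
  df
}

# Small made-up data frame with missing values in both columns
small <- data.frame(one = c(NA, 0.5, -1.2), four = c(0.9, NA, 0.4))

small <- replace_na_col(small, "four")              # NA in `four` becomes 0
small <- replace_na_col(small, "one", value = -1)   # NA in `one` becomes -1
```

The function returns a modified copy, so the result has to be assigned back, as in the two calls above.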

answered Sep 19 '22 19:09 by A5C1D2H2I1M1N2O1R2T1