Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replace NAs with missing values in sequence (R)

I have a DF like

enter image description here

Now I want to replace The Col B = NA with 15 since that is the missing value. Col C first NA with 14 and second NA with 15. Col D first NA with 13, second NA with 14 and third NA with 15. So the numbers follow a sequence up to down or down to up.

Reproducible Sample Data

structure(list(`Col A` = c(11, 12, 13, 14, 15), `Col B` = c(NA, 
11, 12, 13, 14), `Col C` = c(NA, NA, 11, 12, 13), `Col D` = c(NA, 
NA, NA, 11, 12)), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))
like image 289
Kamakshi Bagga Avatar asked Aug 06 '21 03:08

Kamakshi Bagga


People also ask

How do I replace Na with missing in R?

The classic way to replace NA's in R is by using the IS.NA() function. The IS.NA() function takes a vector or data frame as input and returns a logical object that indicates whether a value is missing (TRUE or VALUE). Next, you can use this logical object to create a subset of the missing values and assign them a zero.

How do I replace all NA values in a Dataframe in R?

To replace NA with 0 in an R data frame, use is.na() function and then select all those values with NA and assign them to 0. myDataframe is the data frame in which you would like replace all NAs with 0.

How do I exclude data with NA in R?

First, if we want to exclude missing values from mathematical operations use the na. rm = TRUE argument. If you do not exclude these values most functions will return an NA . We may also desire to subset our data to obtain complete observations, those observations (rows) in our data that contain no missing data.

How do you replace missing values in average in R?

The easiest way to replace NA's in an R data frame is by using the replace_na() function and the mean() function. The first function identifies the missing values, whereas the latter replaces the NA's with the mean.


1 Answers

I think you can use the following solution in tidyverse:

library(dplyr)
library(purrr)

df[1] %>%
  bind_cols(map_dfc(2:length(df), function(x) {
    df[[x]][which(is.na(df[[x]]))] <- setdiff(df[[1]], df[[x]][!is.na(df[[x]])])
    df[x]
  }))

# A tibble: 5 x 4
  `Col A` `Col B` `Col C` `Col D`
    <dbl>   <dbl>   <dbl>   <dbl>
1      11      15      14      13
2      12      11      15      14
3      13      12      11      15
4      14      13      12      11
5      15      14      13      12

Or in base R we could do:

do.call(cbind, Reduce(function(x, y) {
  i <- which(is.na(df[[y]]))
  df[[y]][i] <- sort(setdiff(x, df[[y]]))
  df[[y]]
}, init = df[[1]], 2:length(df), accumulate = TRUE)) |>
  as.data.frame() |>
  setNames(paste0("Col", LETTERS[1:length(df)]))

  ColA ColB ColC ColD
1   11   15   14   13
2   12   11   15   14
3   13   12   11   15
4   14   13   12   11
5   15   14   13   12
like image 134
Anoushiravan R Avatar answered Oct 12 '22 15:10

Anoushiravan R