
Replacing NAs between two rows with identical values in a specific column

Tags:

r

I have a dataframe with multiple columns and I want to replace NAs in one column if they are between two rows with an identical number. Here is my data:

    v1 v2 
    1  2  
    NA 3
    NA 2
    1  1
    NA 7
    NA 2
    3  1
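
For anyone who wants to run the answers, the sample data above can be recreated roughly like this. The name df is an assumption, since the question does not name the data frame.

```r
# Sample data from the question; the object name df is an assumption
df <- data.frame(
  v1 = c(1, NA, NA, 1, NA, NA, 3),
  v2 = c(2, 3, 2, 1, 7, 2, 1)
)
```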

I basically want to start from the beginning of the data frame and replace the NAs in column v1 with the previous non-NA value, but only if the next non-NA value matches the previous one. That being said, I want the result to look like this:

    v1 v2
    1  2
    1  3
    1  2
    1  1
    NA 7
    NA 2
    3  1

As you can see, rows 2 and 3 are filled with 1 because rows 1 and 4 hold the same number, but rows 5 and 6 stay NA because the non-NA values in rows 4 and 7 are not identical. I have been tweaking a lot but so far no luck. Thanks

Fatima asked Mar 09 '23 06:03



2 Answers

Here is an idea using the zoo package. We fill the NAs in both directions (forward and backward) and then set back to NA every value on which the two fills disagree.

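    library(zoo)

    # na.rm = FALSE keeps leading/trailing NAs, so both vectors keep their length
    # Backward fill: each NA takes the next non-NA value
    ind1 <- na.locf(df$v1, fromLast = TRUE, na.rm = FALSE)
    # Forward fill: each NA takes the previous non-NA value
    df$v1 <- na.locf(df$v1, na.rm = FALSE)
    # Where the two fills disagree, the neighbours differ, so restore NA
    df$v1[df$v1 != ind1] <- NA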

which gives,

      v1 v2
    1  1  2
    2  1  3
    3  1  2
    4  1  1
    5 NA  7
    6 NA  2
    7  3  1
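
To see why comparing the two fills works, here is a small base-R sketch that builds both intermediate vectors for the question's v1. The helper locf is hypothetical, a stand-in for a forward fill written only so the example runs without extra packages; it is not zoo's implementation.

```r
# Hypothetical helper: forward fill (last observation carried forward).
# Leading NAs are kept as NA.
locf <- function(x) {
  idx <- cumsum(!is.na(x))        # number of non-NA values seen so far (0 = none yet)
  c(NA, x[!is.na(x)])[idx + 1]    # look up the most recent non-NA value
}

v1 <- c(1, NA, NA, 1, NA, NA, 3)

down <- locf(v1)                  # forward fill:  1 1 1 1 1 1 3
up   <- rev(locf(rev(v1)))        # backward fill: 1 1 1 1 3 3 3

# Keep a value only where both directions agree
result <- ifelse(down == up, down, NA)
print(result)
# [1]  1  1  1  1 NA NA  3
```

Rows 5 and 6 are the only positions where down and up differ (1 versus 3), which is exactly the gap whose neighbours do not match.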
Sotos answered May 20 '23 00:05



Here is a similar approach in the tidyverse, using fill from tidyr:

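    library(tidyverse)

    df1 %>%
      mutate(vNew = v1) %>%                         # working copy of v1
      fill(vNew, .direction = 'up') %>%             # backward fill the copy
      fill(v1) %>%                                  # forward fill v1 (default direction is 'down')
      mutate(v1 = replace(v1, v1 != vNew, NA)) %>%  # restore NA where the two fills disagree
      select(-vNew)                                 # drop the working copy
    #  v1 v2
    #1  1  2
    #2  1  3
    #3  1  2
    #4  1  1
    #5 NA  7
    #6 NA  2
    #7  3  1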
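
If neither package is available, the rule from the question can also be applied gap by gap in base R. This is only a sketch: fill_between_equal is a hypothetical helper name, and it assumes the column is a plain atomic vector.

```r
# Hypothetical helper: fill each run of NAs only when the values
# immediately before and after the run are equal.
fill_between_equal <- function(x) {
  runs   <- rle(is.na(x))                 # alternating runs of NA / non-NA
  ends   <- cumsum(runs$lengths)          # last index of each run
  starts <- ends - runs$lengths + 1       # first index of each run
  for (i in which(runs$values)) {         # visit the NA runs only
    lo <- starts[i] - 1                   # position just before the gap
    hi <- ends[i] + 1                     # position just after the gap
    # Gaps touching either end of the vector have no pair of neighbours
    if (lo >= 1 && hi <= length(x) && x[lo] == x[hi]) {
      x[starts[i]:ends[i]] <- x[lo]
    }
  }
  x
}

print(fill_between_equal(c(1, NA, NA, 1, NA, NA, 3)))
# [1]  1  1  1  1 NA NA  3
print(fill_between_equal(c(NA, 2, NA, 2, NA)))
# [1] NA  2  2  2 NA
```

On a data frame this would be used as df1$v1 <- fill_between_equal(df1$v1). Leading and trailing NAs are left untouched because they are not enclosed by two values.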
akrun answered May 19 '23 22:05
