
Replacing values in a dataframe with values from another dataframe by index values

I have a method for replacing values in a dataframe by matching ID values. It works well for small datasets but is slow on large ones. Does anyone have a suggestion for making this process more computationally efficient?

Below is an example of my R code. I am using the tidyverse package.

# Delta Array small test
test_df <- data.frame(ID = c(1,2,3,4,5,6,7,8,8,9),
                  val = c(1,NA,3,4,5,6,7,8,NA,9))

delta_test <- data.frame(ID = c(2,8,9),
                     val = c(2,100,50))

test_df$val <- ifelse(is.na(delta_test$val[match(test_df$ID, delta_test$ID)]),
                  test_df$val,
                  delta_test$val[match(test_df$ID, delta_test$ID)])

test_df
Mikey Johnson asked Jan 31 '26 03:01



1 Answer

You can left-join test_df with delta_test on ID and then use coalesce to pick the first non-NA value, preferring the value from delta_test.

library(dplyr)

test_df <- test_df %>%
             left_join(delta_test, by = 'ID') %>%
             mutate(val = coalesce(val.y, val.x)) %>%
             select(ID, val)
test_df
#  ID val
#1   1   1
#2   2   2
#3   3   3
#4   4   4
#5   5   5
#6   6   6
#7   7   7
#8   8 100
#9   8 100
#10  9  50
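
To make the role of coalesce clearer, here is a minimal sketch on two standalone vectors. The vector names and values are hypothetical and chosen only for illustration; they are not taken from the data above.

```r
library(dplyr)

# Hypothetical vectors, used only to illustrate coalesce()
from_delta <- c(NA, 100, NA)  # replacement values; NA where no replacement exists
original   <- c(1, 8, NA)     # values already present in the data frame

# For each position, take the first argument's value unless it is NA,
# in which case fall back to the second argument's value
filled <- coalesce(from_delta, original)
filled
```

The second position takes the replacement (100), the first keeps the original (1), and the third stays NA because neither vector has a value there.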

In base R, this can be implemented as:

# Note: merge() sorts the result by ID; keep only ID and the updated val column
test_df <- transform(merge(test_df, delta_test, by = 'ID', all.x = TRUE),
                     val = ifelse(is.na(val.y), val.x, val.y))[c('ID', 'val')]
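
As a possible alternative that avoids the join altogether, the sketch below keeps the match-based idea from the question but calls match only once and assigns only to the rows that have a replacement. It rebuilds the example data from the question so it can be run on its own; the helper names idx, new_val and use_new are made up for this sketch. Whether it is actually faster on a given large dataset is something to measure rather than assume.

```r
# Example data copied from the question
test_df <- data.frame(ID = c(1,2,3,4,5,6,7,8,8,9),
                      val = c(1,NA,3,4,5,6,7,8,NA,9))
delta_test <- data.frame(ID = c(2,8,9),
                         val = c(2,100,50))

# Look up each ID of test_df in delta_test once (NA where there is no match)
idx <- match(test_df$ID, delta_test$ID)

# Replacement value for every row of test_df (NA where nothing matched)
new_val <- delta_test$val[idx]

# Overwrite only the rows that have a non-NA replacement
use_new <- !is.na(new_val)
test_df$val[use_new] <- new_val[use_new]

test_df
```

This keeps the original row order and leaves rows without a match untouched, which is the same rule the ifelse version in the question applies.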
Ronak Shah answered Feb 02 '26 19:02



