Replace values from another dataframe by IDs

Question

I have two data frames::

as1 <- data.frame(ID = c(1,2,3,4,5,6),
                  pID = c(21,22,23,24,25,26),
                  Values = c(435,33,45,NA, NA,12))
as2 <- data.frame(ID = c(4,5),
                  pid = c(24,25),
                  Values = c(544, 676))

I need to replace the NA values in as1 with those in as2 by matching ID and pID

I need to get the result data frame as:

  resultdf
    ID pID Values
    1  1  21    435
    2  2  22     33
    3  3  23     45
    4  4  24    544
    5  5  25    676
    6  6  26     12

I tried doing subset and then na.omit() and then rbinding... but I'm losing the index.

lmo · Accepted Answer

Here are two base R solutions.

First, using match in "ID" to select the elements of "Value" in as1 to fill in:

as1$Values[match(as2$ID, as1$ID)] <- as2$Values

as1
  ID pID Values
1  1  21    435
2  2  22     33
3  3  23     45
4  4  24    544
5  5  25    676
6  6  26     12

This only works if ID is the true ID for both data sets (that is, pid is "irrelevant"). Second, in the case that pid is also needed, you could use merge and then "collapse" the two values columns as follows:

df <- merge(as1, as2, by.x=c("ID", "pID"), by.y=c("ID", "pid"), all=TRUE)

This produces a four column data frame with two values columns. Collapse these into a single column with ifelse:

df <- cbind(df[c(1,2)], "Values"=with(df, ifelse(is.na(Values.y), Values.x, Values.y)))

df
  ID pID Values
1  1  21    435
2  2  22     33
3  3  23     45
4  4  24    544
5  5  25    676
6  6  26     12

akrun · Answer

Or another option is data.table

library(data.table)
setDT(as1)[as2, Values := i.Values , on =.(ID, pid)]
as1
#    ID pid Values
#1:  1  21    435
#2:  2  22     33
#3:  3  23     45
#4:  4  24    544
#5:  5  25    676
#6:  6  26     12

Marta · Answer

Try with dplyr library. Firstly join both tables and then create new column with values instead NA's:

library("dplyr")

as1 <- data.frame(ID = c(1,2,3,4,5,6), pid = c(21,22,23,24,25,26),Values = c(435,33,45,NA, NA,12))
as2 <- data.frame(ID = c(4,5),pid = c(24,25), Values = c(544, 676))

left_join(as1, as2, by = c("ID", "pid")) %>% 
    mutate(Values = ifelse(is.na(Values.x), Values.y, Values.x)) %>% 
    select(ID, pid, Values)

# ID pid Values
# 1  1  21    435
# 2  2  22     33
# 3  3  23     45
# 4  4  24    544
# 5  5  25    676
# 6  6  26     12

aichao · Answer

Following Marta's dplyr solution, we can use coalesce instead to combine the merged Value.x and Value.y:

library(dplyr)
res <- as1 %>% left_join(as2, by = c("ID", "pID"="pid")) %>%
               mutate(Values=coalesce(Values.x,Values.y)) %>%
               select(-Values.x,-Values.y)
##  ID pID Values
##1  1  21    435
##2  2  22     33
##3  3  23     45
##4  4  24    544
##5  5  25    676
##6  6  26     12

Note also the syntax used for the by parameter in left_join. Here, we join by ID and by pID for as1 and pid for as2.

Replace values from another dataframe by IDs

Tags:

dataframe

r

Tarak

4 Answers

lmo

akrun

Marta

aichao

Recent Activity

Donate For Us

Replace values from another dataframe by IDs

Tags:

dataframe

r

Tarak

4 Answers

lmo

akrun

Marta

aichao

Related questions

Recent Activity

Donate For Us