join data frames and replace one column with another

Question

I have two data frames, one with all my data, and another with a corrected ID number for some of the data.

When I attempt to join these values with either a left, inner or full join, I end up with two ID columns (ID.x and ID.y). Is there anyway to do this so that the ID in my corrected dataset simply replaces the ID in the full dataset?

E.g.,

df.full <- left_join(df.full, correctID, 
                     by = c("value"))

I've tried both copy = TRUE and FALSE, but this doesn't seem to help.

Gregor Thomas · Accepted Answer

There are a few cases:

If you always want the value from correctID, just drop the ID column from df.full first:

df.full %>%
  select(-ID) %>%
  left_join(correctID, by = "value")

If correctID isn't complete, and you only want to use it when present:

df.full %>%
  left_join(correctID, by = "value") %>%
  mutate(ID = coalesce(ID.y, ID.x)) %>%
  select(-ID.y, -ID.x)

You can, of course, reverse that in the opposite case (only want to use correctID when df.full$ID is missing).

join data frames and replace one column with another

Tags:

r

dplyr

tnt

1 Answers

Gregor Thomas

Recent Activity

Donate For Us

join data frames and replace one column with another

Tags:

r

dplyr

tnt

1 Answers

Gregor Thomas

Related questions

Recent Activity

Donate For Us