I'm new to R programming and I'm stuck on the example below.
Basically I have two data sets:
dataset1:
ID Category
1 CatZZ
2 CatVV
3 CatAA
4 CatQQ
dataset2:
ID Category
1 Cat600
3 Cat611
I'm trying to overwrite the Category values in dataset1 with the Category values from dataset2 wherever the ID matches between the two data sets.
So the outcome would look something like this:
dataset1:
ID Category
1 Cat600
2 CatVV
3 Cat611
4 CatQQ
With the tidyverse, you can do:
library(dplyr)

df1 %>%
  left_join(df2, by = "ID") %>%  # Merge the two data frames on ID
  mutate(Category = if_else(!is.na(Category.y), Category.y, Category.x)) %>%  # Take df2's value when there is a match, otherwise keep df1's
  select(ID, Category)  # Drop the redundant columns
  ID Category
1  1   Cat600
2  2    CatVV
3  3   Cat611
4  4    CatQQ
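The if_else() step above can also be written with dplyr's coalesce(), which returns the first non-missing value among its arguments. This is a minimal sketch, assuming the same df1 and df2 as in the sample data; the name `result` is only used here for illustration.

```r
library(dplyr)

# Same sample data as in the question (rebuilt here so the snippet runs on its own)
df1 <- data.frame(ID = 1:4,
                  Category = c("CatZZ", "CatVV", "CatAA", "CatQQ"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(1L, 3L),
                  Category = c("Cat600", "Cat611"),
                  stringsAsFactors = FALSE)

result <- df1 %>%
  left_join(df2, by = "ID") %>%                          # Category.x from df1, Category.y from df2
  mutate(Category = coalesce(Category.y, Category.x)) %>% # prefer df2's value, fall back to df1's
  select(ID, Category)
```

Putting Category.y first is what gives df2 priority; swapping the arguments would keep df1's values instead.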
Or:
library(dplyr)
library(tidyr)

df1 %>%
  left_join(df2, by = "ID") %>%  # Merge the two data frames on ID
  gather(var, val, -ID) %>%  # Reshape from wide to long format
  arrange(ID) %>%  # Sort by ID
  group_by(ID) %>%  # Group by ID
  mutate(Category = if_else(!is.na(nth(val, 2)), nth(val, 2), first(val))) %>%  # If non-NA, take the value from df2, otherwise from df1
  spread(var, val) %>%  # Reshape back to wide format
  select(ID, Category)  # Drop the redundant columns
     ID Category
  <int> <chr>
1     1 Cat600
2     2 CatVV
3     3 Cat611
4     4 CatQQ
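If your dplyr version is 1.0.0 or later, rows_update() may be a more direct fit, since it overwrites values in the first data frame with those from the second for matching keys. This is a sketch under two assumptions that hold for the sample data: every ID in df2 also exists in df1, and IDs in df2 are unique. The name `result` is only for illustration.

```r
library(dplyr)

# Same sample data as in the question (rebuilt here so the snippet runs on its own)
df1 <- data.frame(ID = 1:4,
                  Category = c("CatZZ", "CatVV", "CatAA", "CatQQ"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(1L, 3L),
                  Category = c("Cat600", "Cat611"),
                  stringsAsFactors = FALSE)

# Overwrite Category in df1 with df2's value wherever the ID matches
result <- rows_update(df1, df2, by = "ID")
```

By default rows_update() raises an error when df2 contains an ID that is missing from df1, so this variant is best suited to cases where df2 is known to be a subset of df1's keys.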
Sample data:
df1 <- read.table(text = "ID Category
1 CatZZ
2 CatVV
3 CatAA
4 CatQQ", header = TRUE, stringsAsFactors = FALSE)
df2 <- read.table(text = "ID Category
1 Cat600
3 Cat611", header = TRUE, stringsAsFactors = FALSE)
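For comparison, the same overwrite can be done in base R without any packages, using match() to look up each ID from df1 in df2. This is only a sketch on the sample data above; the helper names `idx` and `matched` are made up for illustration.

```r
# Same sample data as above (rebuilt here so the snippet runs on its own)
df1 <- data.frame(ID = 1:4,
                  Category = c("CatZZ", "CatVV", "CatAA", "CatQQ"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(1L, 3L),
                  Category = c("Cat600", "Cat611"),
                  stringsAsFactors = FALSE)

# Position of each df1 ID within df2 (NA when df2 has no such ID)
idx <- match(df1$ID, df2$ID)
matched <- !is.na(idx)

# Overwrite only the rows that have a match; the others keep their df1 value
df1$Category[matched] <- df2$Category[idx[matched]]
```

Note that this modifies df1 in place, whereas the pipelines above return a new data frame and leave df1 unchanged.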