
Merge dataframes of different sizes

Tags: dataframe, r

I have two data frames, d1 and d2, respectively:

x   y  z
10  10 7
10  12 6
11  10 8
11  12 2
12  10 1
12  12 5

x  y  z
10 10 100
11 10 200
12 12 400

I want something like:

x   y  z
10  10 100
10  12 6
11  10 200
11  12 2
12  10 1
12  12 400

I am sorry for the trivial question, but I could not find the answer.

Pankaj asked Dec 10 '22 19:12



1 Answer

From your description, I understand that you want to replace the z values in d1 with the z values in d2 wherever x and y match.
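To make the approaches in this answer reproducible, the two data frames from the question can be rebuilt as follows. This is a minimal sketch that assumes all three columns are plain numeric vectors; the question does not state the column types.

```r
# Rebuild the example data from the question (column types are an assumption)
d1 <- data.frame(
  x = c(10, 10, 11, 11, 12, 12),
  y = c(10, 12, 10, 12, 10, 12),
  z = c(7, 6, 8, 2, 1, 5)
)

d2 <- data.frame(
  x = c(10, 11, 12),
  y = c(10, 10, 12),
  z = c(100, 200, 400)
)
```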

Using base R:

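# left join: keeps every row of d1; z becomes z.x (from d1) and z.y (from d2)
d3 <- merge(d1, d2, by = c("x","y"), all.x = TRUE)
# where d2 had no matching row, fall back to the original z value
d3[is.na(d3$z.y),"z.y"] <- d3[is.na(d3$z.y),"z.x"]
# drop z.x (column 3) and rename the remaining z.y to z
d3 <- d3[,-3]
names(d3)[3] <- "z"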

which gives:

> d3
   x  y   z
1 10 10 100
2 10 12   6
3 11 10 200
4 11 12   2
5 12 10   1
6 12 12 400
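Note that merge() sorts its result on the key columns by default, so the row order of d3 can differ from d1 when d1 is not already sorted. If the original order matters, a possible alternative is to look up the matching rows with match(). This is only a sketch: it assumes each (x, y) pair occurs at most once in d2, and the helper names idx and hit are made up for illustration.

```r
# Example data from the question (numeric columns assumed)
d1 <- data.frame(x = c(10, 10, 11, 11, 12, 12),
                 y = c(10, 12, 10, 12, 10, 12),
                 z = c(7, 6, 8, 2, 1, 5))
d2 <- data.frame(x = c(10, 11, 12),
                 y = c(10, 10, 12),
                 z = c(100, 200, 400))

# position of each d1 row's (x, y) key in d2; NA when there is no match
idx <- match(paste(d1$x, d1$y), paste(d2$x, d2$y))
hit <- !is.na(idx)

# copy d1 and overwrite z only for the rows that have a match in d2
d3 <- d1
d3$z[hit] <- d2$z[idx[hit]]
```

Because d3 starts as a copy of d1, the rows stay in d1's order and no columns need to be dropped or renamed afterwards.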

Using the data.table package:

library(data.table)

setDT(d1) # convert the data.frame to a data.table by reference
setDT(d2) # likewise

# join the two data.tables and update z in d1 by reference for matching rows;
# i.z refers to the z column of d2
d1[d2, on = .(x, y), z := i.z]

or in one go:

setDT(d1)[setDT(d2), on = .(x, y), z := i.z]

which gives:

> d1
    x  y   z
1: 10 10 100
2: 10 12   6
3: 11 10 200
4: 11 12   2
5: 12 10   1
6: 12 12 400
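Keep in mind that := updates d1 by reference, so the original z values are overwritten. If d1 should stay untouched, one option is to run the update join on a copy. The sketch below assumes the example data from the question with numeric columns.

```r
library(data.table)

# Example data from the question (numeric columns assumed)
d1 <- data.table(x = c(10, 10, 11, 11, 12, 12),
                 y = c(10, 12, 10, 12, 10, 12),
                 z = c(7, 6, 8, 2, 1, 5))
d2 <- data.table(x = c(10, 11, 12),
                 y = c(10, 10, 12),
                 z = c(100, 200, 400))

# copy() makes a deep copy, so the update join leaves d1 itself unchanged
d3 <- copy(d1)[d2, on = .(x, y), z := i.z]
```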

Using the dplyr package:

library(dplyr)

# left join keeps all rows of d1; the clashing z columns become z.x and z.y
d3 <- left_join(d1, d2, by = c("x","y")) %>%
  mutate(z.y = ifelse(is.na(z.y), z.x, z.y)) %>%
  select(-z.x) %>%
  rename(z = z.y)

Since dplyr 0.5.0 you can also use the coalesce function for this (thanks to Laurent Hostert for bringing it to my attention):

# coalesce() takes the first non-missing value, so z.y wins where a match exists
d3 <- left_join(d1, d2, by = c("x","y")) %>%
  mutate(z = coalesce(z.y, z.x)) %>%
  select(-c(z.x, z.y))
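With a more recent dplyr (1.0.0 or later, which is an assumption about your installation), rows_update() may be a shorter way to express the same replacement. The sketch below assumes the example data from the question and that every (x, y) pair in d2 also exists in d1, since rows_update() by default raises an error for keys that are missing from the first table.

```r
library(dplyr)

# Example data from the question (numeric columns assumed)
d1 <- data.frame(x = c(10, 10, 11, 11, 12, 12),
                 y = c(10, 12, 10, 12, 10, 12),
                 z = c(7, 6, 8, 2, 1, 5))
d2 <- data.frame(x = c(10, 11, 12),
                 y = c(10, 10, 12),
                 z = c(100, 200, 400))

# overwrite z in d1 with the value from d2 for rows whose (x, y) key matches
d3 <- rows_update(d1, d2, by = c("x", "y"))
```

This returns a new data frame and leaves d1 unchanged, with no intermediate z.x and z.y columns to clean up.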
Jaap answered Jan 03 '23 09:01
