I have two datasets. The first one is smaller but has more precise data. I need to combine them under two rules: 1. If data is present in Data1, I use only the data from Data1. 2. If data is missing from Data1 but present in Data2, I use the data from Data2.
Data1 <- data.frame(
X = c(1,4,7,10,13,16),
Y = c("a", "b", "c", "d", "e", "f")
)
Data2 <- data.frame(
X = c(1:10),
Y = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")
)
So my data.frame should look like this:
DataJoin <- data.frame(
X = c(1,4,7,10,13,16,7,8,9,10),
Y = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")
)
How can I do that? I've tried merge from the base package and from the data.table package, but I couldn't get the result I want.
There's no join needed. You can reformulate the problem as "add the data found in Data2 and not found in Data1 to Data1". So simply do:
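# TRUE for rows of Data2 whose Y already appears in Data1
id <- Data2$Y %in% Data1$Y
# keep all of Data1, then append only the unmatched rows of Data2
DataJoin <- rbind(Data1, Data2[!id, ])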
Gives:
> DataJoin
    X Y
1   1 a
2   4 b
3   7 c
4  10 d
5  13 e
6  16 f
7   7 g
8   8 h
9   9 i
10 10 j
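If you need this in more than one place, the same idea can be wrapped in a small function. This is only a sketch: the name `prefer_primary` and its `key` argument are made up for illustration, and it assumes both data frames have the same columns and are matched on a single key column.

```r
Data1 <- data.frame(X = c(1, 4, 7, 10, 13, 16),
                    Y = c("a", "b", "c", "d", "e", "f"),
                    stringsAsFactors = FALSE)
Data2 <- data.frame(X = 1:10,
                    Y = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: keep every row of `primary`, then append the rows of
# `secondary` whose key value does not occur in `primary`.
prefer_primary <- function(primary, secondary, key) {
  missing_in_primary <- !(secondary[[key]] %in% primary[[key]])
  out <- rbind(primary, secondary[missing_in_primary, , drop = FALSE])
  rownames(out) <- NULL  # renumber rows after stacking
  out
}

DataJoin <- prefer_primary(Data1, Data2, key = "Y")
```

The order of the arguments decides which dataset wins when a key appears in both, so the more precise dataset goes first.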
Using data.table:
library(data.table)
# key both tables on Y; cast X to integer so it matches the type of Data2$X
d1 <- data.table(Data1, key="Y")[, X := as.integer(X)]
d2 <- data.table(Data2, key="Y")
# copy d2 so that it doesn't get modified by reference
# i.X refers to column X of the table passed as 'i', i.e. d1's X
ans <- copy(d2)[d1, X := i.X]
ans
     X Y
 1:  1 a
 2:  4 b
 3:  7 c
 4: 10 d
 5: 13 e
 6: 16 f
 7:  7 g
 8:  8 h
 9:  9 i
10: 10 j
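Another way to write this in data.table, closer to the "add what is missing from Data1" wording, is a not-join followed by rbind. Treat this as a sketch rather than the one right way: it assumes a data.table version that supports the `on=` argument, and the names `only_in_d2` and `ans2` are just for illustration.

```r
library(data.table)

Data1 <- data.frame(X = c(1, 4, 7, 10, 13, 16),
                    Y = c("a", "b", "c", "d", "e", "f"),
                    stringsAsFactors = FALSE)
Data2 <- data.frame(X = 1:10,
                    Y = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j"),
                    stringsAsFactors = FALSE)

d1 <- as.data.table(Data1)
d2 <- as.data.table(Data2)

# Not-join: rows of d2 whose Y has no match in d1
only_in_d2 <- d2[!d1, on = "Y"]

# Stack all of d1 on top of the unmatched rows of d2
ans2 <- rbind(d1, only_in_d2)
```

Unlike the update join, this leaves both d1 and d2 untouched, so no copy() is needed.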