Merging on multiple columns in R

Tags: merge, r

I'd be surprised if this isn't a duplicate, but I couldn't find the answer anywhere else.

I have two data frames, data1 and data2, that differ in one column (trait.1 in data1, trait.2 in data2); the rest of the columns are the same. I would like to merge them on a unique identifying column, id. However, when an id from data2 has no match in data1, I want that row of data2 appended at the bottom, similar to plyr::rbind.fill(), rather than having the shared columns duplicated and renamed with .x and .y suffixes (e.g. spp.x and spp.y). I realize this isn't the clearest explanation; maybe I shouldn't be working on a Saturday. Here is code to create the two data frames, followed by the desired output:

spp1 <- c('A','B','C')
spp2 <- c('B','C','D')
trait.1 <- rep(1.1,length(spp1))
trait.2 <- rep(2.0,length(spp2))
id_1 <- c(1,2,3)
id_2 <- c(2,9,7)

data1 <- data.frame(spp1,trait.1,id_1)
data2 <- data.frame(spp2,trait.2,id_2)
colnames(data1) <- c('spp','trait.1','id')
colnames(data2) <- c('spp','trait.2','id')

Desired output:

  spp trait.1 trait.2 id
1   A     1.1      NA  1
2   B     1.1       2  2
3   C     1.1      NA  3
4   C      NA       2  9
5   D      NA       2  7
asked by colin, Mar 06 '23 02:03



1 Answer

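Do a full join on both shared columns, id and spp, so that neither of them gets duplicated with .x/.y suffixes and unmatched rows from either data frame are kept: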

library(dplyr)

full_join(data1, data2, by = c("id", "spp"))

Output:

  spp trait.1 id trait.2
1   A     1.1  1      NA
2   B     1.1  2       2
3   C     1.1  3      NA
4   C      NA  9       2
5   D      NA  7       2
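
To illustrate why spp belongs in the join key, here is a minimal sketch that contrasts joining on id alone with joining on both columns. The data frames are rebuilt inline so the snippet runs on its own; the names by_id and by_both are only for illustration, and stringsAsFactors = FALSE is an assumption added to keep spp as character on older R versions.

```r
library(dplyr)

# Same data as in the question, rebuilt compactly
data1 <- data.frame(spp = c("A", "B", "C"), trait.1 = 1.1, id = c(1, 2, 3),
                    stringsAsFactors = FALSE)
data2 <- data.frame(spp = c("B", "C", "D"), trait.2 = 2.0, id = c(2, 9, 7),
                    stringsAsFactors = FALSE)

# Joining on id only: the shared non-key column spp shows up twice,
# as spp.x and spp.y
by_id <- full_join(data1, data2, by = "id")
names(by_id)

# Joining on both shared columns: spp is part of the key, so it appears once
by_both <- full_join(data1, data2, by = c("id", "spp"))
names(by_both)
```

Both results have five rows, because only id 2 (species B) appears in both data frames; the difference is purely in how the shared column is treated.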

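Alternatively, base R's merge() would also work. Note that it puts the key columns first and, by default, sorts the rows by them, so the column and row order differ from the output above: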

merge(data1, data2, by = c("id", "spp"), all = TRUE)
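
If the column order of the desired output matters, the merge() result can be reordered afterwards. This is a small sketch under the same data as the question (rebuilt inline so it runs standalone); the name merged is just for illustration, and the row order is left as merge() returns it rather than forced to match the question exactly.

```r
# Same data as in the question, rebuilt compactly
data1 <- data.frame(spp = c("A", "B", "C"), trait.1 = 1.1, id = c(1, 2, 3),
                    stringsAsFactors = FALSE)
data2 <- data.frame(spp = c("B", "C", "D"), trait.2 = 2.0, id = c(2, 9, 7),
                    stringsAsFactors = FALSE)

# all = TRUE keeps unmatched rows from both sides (a full outer join)
merged <- merge(data1, data2, by = c("id", "spp"), all = TRUE)

# merge() returns the key columns first; pick the columns in the
# order shown in the question's desired output
merged <- merged[, c("spp", "trait.1", "trait.2", "id")]
merged
```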
answered by arg0naut91, Mar 21 '23 01:03
