
Trying to compare two dataframes with different rows and columns in R

Tags:

r

compare

I am trying to compare two dataframes in R that have different columns and different rows. Rows that are the same in both should go into df3, and rows where any value differs should go into df4. In my example, id F has the same col1 and col2 in both tables, but the other columns are not the same.

Below is what my dataset looks like:

set.seed(22)

df1 <- data.frame(id=sample(LETTERS, 9, FALSE), col1=sample(0:2, 9, TRUE),
                  col2 = sample(0:2, 9, TRUE))
df2 <- data.frame(id=sample(LETTERS, 17, FALSE), col1=sample(0:2, 17, TRUE),
                  col2 = sample(0:2, 17, TRUE),
                  col6 = sample(0:2, 17, TRUE))


I've read many solutions but have not yet found a concise one. Any suggestions? Any help is much appreciated.

Ollie asked Sep 02 '25 13:09


1 Answer

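You can use generics::intersect() to find the rows common to both data frames and generics::setdiff() to find the rows of df1 that do not appear in df2. Subsetting df2 to its first three columns (df2[,1:3]) restricts the comparison to the columns the two data frames share. Note that you need to call these functions with the generics:: prefix to get the result in the format you want.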

df3 <- generics::intersect(df1, df2[,1:3])
#   id col1 col2
# 1  F    1    0
# 2  K    0    2

df4 <- generics::setdiff(df1, df2[,1:3])
#   id col1 col2
# 1  I    1    2
# 2  X    2    0
# 3  J    0    2
# 4  L    0    1
# 5  Q    1    0
# 6  E    1    0
# 7  C    1    0
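
If you would rather not depend on a package, the same split can be sketched in base R by building one text key per row from the shared columns. This is an alternative technique, not the approach used above. The data frames a and b and the helper names shared, key_a, key_b and in_both are made up for illustration, and the sketch assumes no value contains the "\r" separator.

```r
# Hypothetical hand-made data (not the question's random sample)
a <- data.frame(id = c("F", "K", "I"),
                col1 = c(1, 0, 1),
                col2 = c(0, 2, 2))
b <- data.frame(id = c("F", "K", "I", "Z"),
                col1 = c(1, 0, 2, 0),
                col2 = c(0, 2, 2, 1),
                col6 = c(2, 1, 0, 1))

# Columns present in both data frames
shared <- intersect(names(a), names(b))

# One string per row, built from the shared columns only
key_a <- do.call(paste, c(a[shared], sep = "\r"))
key_b <- do.call(paste, c(b[shared], sep = "\r"))

# TRUE for rows of a that have an identical row in b
in_both <- key_a %in% key_b

same <- a[in_both, , drop = FALSE]   # plays the role of df3
diff <- a[!in_both, , drop = FALSE]  # plays the role of df4
```

With this toy data, same holds the rows for ids F and K, and diff holds the row for id I, whose col1 differs between the two data frames. Because shared is computed from the column names, there is no need to hard-code a column range such as 1:3.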

jpsmith answered Sep 05 '25 04:09