Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find complement of a data frame (anti - join)

Tags:

merge

join

r

r-faq

I have two data frames(df and df1). df1 is subset of df. I want to get a data frame which is complement of df1 in df, i.e. return rows of the first data set which are not matched in the second. For example let,

data frame df:

heads row1 row2 row3 row4 row5 

data frame df1:

heads row3 row5 

Then the desired output df2 is:

heads row1 row2 row4 
like image 252
oercim Avatar asked Feb 24 '15 17:02

oercim


1 Answers

You could also do some type of anti join with data.tables binary join

library(data.table) setkey(setDT(df), heads)[!df1] #    heads # 1:  row1 # 2:  row2 # 3:  row4 

EDIT: Starting data.table v1.9.6+ we can join data.tables without setting keys while using on

setDT(df)[!df1, on = "heads"] 

EDIT2: Starting data.table v1.9.8+ fsetdiff was introduced which is basically a variation of the solution above, just over all the column names of the x data.table, e.g. x[!y, on = names(x)]. If all set to FALSE (the default behavior), then only unique rows in x will be returned. For the case of only one column in each data.table the following will be equivalent to the previous solutions

fsetdiff(df, df1, all = TRUE) 
like image 148
David Arenburg Avatar answered Sep 19 '22 19:09

David Arenburg