
Minus operation of data frames

I have 2 data frames df1 and df2.

df1 <- data.frame(c1=c("a","b","c","d"), c2=c(1,2,3,4))
df2 <- data.frame(c1=c("c","d","e","f"), c2=c(3,4,5,6))

> df1
  c1 c2
1  a  1
2  b  2
3  c  3
4  d  4

> df2
  c1 c2
1  c  3
2  d  4
3  e  5
4  f  6

I need to perform set operations on these two data frames. I used merge(df1, df2, all=TRUE) and merge(df1, df2, all=FALSE) to get their union and intersection, and both gave the required output. Which function gives the difference (minus) of these data frames, that is, all rows that exist in one data frame but not in the other? I need the following output:

  c1 c2
1  a  1
2  b  2
asked by Dinoop Nair, Apr 22 '13 09:04


1 Answer

I remember coming across this exact issue quite a few months back. Managed to sift through my Evernote one-liners.

Note: This is not my solution. Credit goes to whoever wrote it (whom I can't seem to find at the moment).

If you don't care about row names, you can do:

df1[!duplicated(rbind(df2, df1))[-seq_len(nrow(df2))], ]
#   c1 c2
# 1  a  1
# 2  b  2
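To make it clearer why the one-liner above works, here is a step-by-step sketch of the same idea in base R. The intermediate names (stacked, dup, dup_df1, result) are only illustrative and are not part of the original answer.

```r
df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4))
df2 <- data.frame(c1 = c("c", "d", "e", "f"), c2 = c(3, 4, 5, 6))

# Stack df2 on top of df1, so any df1 row that also occurs in df2
# appears later in the combined frame than its df2 copy.
stacked <- rbind(df2, df1)

# duplicated() flags a row as TRUE if an identical row occurred earlier.
dup <- duplicated(stacked)

# Drop the flags belonging to the df2 rows; what remains lines up with df1.
dup_df1 <- dup[-seq_len(nrow(df2))]

# Keep only the df1 rows that were not flagged as duplicates.
result <- df1[!dup_df1, ]
print(result)
```

Two caveats with this approach: a row that is repeated within df1 itself is also flagged from its second occurrence on, and if df2 has zero rows, -seq_len(0) selects nothing, so the expression returns no rows instead of all of df1.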

Edit: A data.table solution:

library(data.table)
dt1 <- data.table(df1, key="c1")  # key dt1 on c1
dt2 <- data.table(df2)
dt1[!dt2]                         # anti-join: rows of dt1 with no match in dt2

Or, better, as a one-liner (data.table v1.9.6+):

setDT(df1)[!df2, on="c1"] 

This returns all rows in df1 whose c1 value has no match in df2$c1.
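Because the match is on c1 only, the result can differ from a whole-row comparison when two rows share a key but differ in c2. The sketch below shows this in base R; df2b is a hypothetical variant of df2 made up for the illustration, and %in% stands in for the key-only anti-join.

```r
df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4))
# Hypothetical variant of df2: same key "c" but a different c2 value.
df2b <- data.frame(c1 = c("c", "d", "e", "f"), c2 = c(99, 4, 5, 6))

# Key-only match: drop every df1 row whose c1 appears in df2b$c1.
by_key <- df1[!(df1$c1 %in% df2b$c1), ]

# Whole-row match: drop only df1 rows identical to some row of df2b.
by_row <- df1[!duplicated(rbind(df2b, df1))[-seq_len(nrow(df2b))], ]

print(by_key)  # rows a and b
print(by_row)  # rows a, b and c, since (c, 3) is not a row of df2b
```

Which behaviour is wanted depends on whether c1 alone identifies a row in your data.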

answered by Arun, Nov 17 '22 15:11