I have two data frames, df1 and df2.
    df1 <- data.frame(c1=c("a","b","c","d"), c2=c(1,2,3,4))
    df2 <- data.frame(c1=c("c","d","e","f"), c2=c(3,4,5,6))

    > df1
      c1 c2
    1  a  1
    2  b  2
    3  c  3
    4  d  4
    > df2
      c1 c2
    1  c  3
    2  d  4
    3  e  5
    4  f  6
I need to perform set operations on these two data frames. I used merge(df1, df2, all=TRUE) and merge(df1, df2, all=FALSE) to get the union and intersection of these data frames and got the required output. What is the function to get the difference (minus) of these data frames, that is, all the rows that exist in one data frame but not in the other? I need the following output:
      c1 c2
    1  a  1
    2  b  2
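Note that except() is not available for ordinary R data frames. It is a Spark operation (SparkR provides except(x, y) for SparkDataFrames) that returns the rows of one DataFrame that are not in the other, so it only applies if your data lives in Spark.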
The minus sign does not do this. For data frames df1 and df2, df1-df2 performs element-wise arithmetic on corresponding cells rather than a set difference, and it fails or yields NA for non-numeric columns such as c1.
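The sub() method that subtracts a specified value from each value in a DataFrame belongs to pandas (Python), and it is also element-wise arithmetic, not a set difference. In R, sub() is a string function that replaces the first match of a pattern, so it does not help here.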
I remember coming across this exact issue quite a few months back. Managed to sift through my Evernote one-liners.
Note: This is not my solution. Credit goes to whoever wrote it (whom I can't seem to find at the moment).
If you don't care about row names, you can do:
    df1[!duplicated(rbind(df2, df1))[-seq_len(nrow(df2))], ]
    #   c1 c2
    # 1  a  1
    # 2  b  2
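To make the one-liner easier to follow, here is a minimal sketch that unpacks it step by step. The helper name df_minus is hypothetical (it is not part of base R or of the original answer), and the sketch assumes both data frames have the same columns. It selects the flags for df1's rows by position instead of using -seq_len(nrow(df2)), because a negative empty index returns nothing when df2 has zero rows.

```r
# Hypothetical helper (name is an assumption): rows of x that do not appear in y.
df_minus <- function(x, y) {
  stacked <- rbind(y, x)        # y's rows first, then x's rows
  dup <- duplicated(stacked)    # TRUE where an identical row occurred earlier
  # Keep only the flags that belong to x's rows (the last nrow(x) entries).
  dup_x <- dup[nrow(y) + seq_len(nrow(x))]
  x[!dup_x, , drop = FALSE]
}

df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4))
df2 <- data.frame(c1 = c("c", "d", "e", "f"), c2 = c(3, 4, 5, 6))

result <- df_minus(df1, df2)
print(result)
```

One side effect worth knowing: because duplicated() flags any repeated row, rows that occur more than once within df1 itself are also reduced to their first occurrence.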
Edit: A data.table solution:
    library(data.table)
    dt1 <- data.table(df1, key="c1")
    dt2 <- data.table(df2)
    dt1[!dt2]
or better one-liner (from v1.9.6+):
setDT(df1)[!df2, on="c1"]
This returns all rows of df1 whose c1 value has no match in df2$c1. Only c1 is compared; c2 is ignored.
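For comparison, the same key-only matching can be sketched in base R with %in%. This is an illustrative equivalent under the assumption that c1 alone identifies a row; it is not the data.table implementation, and the variable name only_in_df1 is made up for the example.

```r
df1 <- data.frame(c1 = c("a", "b", "c", "d"), c2 = c(1, 2, 3, 4))
df2 <- data.frame(c1 = c("c", "d", "e", "f"), c2 = c(3, 4, 5, 6))

# Keep rows of df1 whose c1 value does not occur anywhere in df2$c1.
# Only c1 is compared here; differences in c2 are ignored.
only_in_df1 <- df1[!(df1$c1 %in% df2$c1), , drop = FALSE]
print(only_in_df1)
```

If two rows should count as equal only when every column matches, a whole-row comparison such as the rbind/duplicated approach is the safer choice.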