How do I remove one pandas DataFrame from another, just like set subtraction?

    a = [1, 2, 3, 4, 5]
    b = [1, 5]
    a - b = [2, 3, 4]

Now we have two pandas DataFrames. How do I remove df2 from df1?
    In [5]: df1=pd.DataFrame([[1,2],[3,4],[5,6]],columns=['a','b'])
    In [6]: df1
    Out[6]:
       a  b
    0  1  2
    1  3  4
    2  5  6
    In [9]: df2=pd.DataFrame([[1,2],[5,6]],columns=['a','b'])
    In [10]: df2
    Out[10]:
       a  b
    0  1  2
    1  5  6

Then we expect the result of df1-df2 to be:

    In [14]: df
    Out[14]:
       a  b
    0  3  4

How to do it?
Thank you.
The subtract() function computes the element-wise difference between a DataFrame and another object. It is essentially the same as dataframe - other, but with support for a fill_value that substitutes for missing data in one of the inputs. Because it does arithmetic on aligned cells, it does not remove rows.
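To make the element-wise behavior concrete, here is a minimal sketch using the df1 and df2 from the question. The variable names diff and diff_filled are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["a", "b"])
df2 = pd.DataFrame([[1, 2], [5, 6]], columns=["a", "b"])

# Cells are matched by index label and column label, then subtracted.
# Index 2 exists only in df1, so that row becomes NaN.
diff = df1.subtract(df2)

# fill_value=0 stands in for the missing df2 row before subtracting,
# so row 2 keeps df1's values instead of turning into NaN.
diff_filled = df1.subtract(df2, fill_value=0)

print(diff)
print(diff_filled)
```

Neither result is the set difference the question asks for: every row of df1 is still present, only with different numbers.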
Pandas DataFrame drop() Method

The drop() method removes the specified row or column. By specifying the column axis (axis='columns'), the drop() method removes the specified column. By specifying the row axis (axis='index'), the drop() method removes the specified row.
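A short sketch of both axes, again on the question's df1. Note that drop() works on labels, so it only helps with the original problem if you already know which index labels to remove. The names without_rows and without_col are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["a", "b"])

# Remove the rows labelled 0 and 2; drop() returns a new DataFrame
# and leaves df1 unchanged.
without_rows = df1.drop([0, 2], axis="index")

# Remove the column labelled "b".
without_col = df1.drop("b", axis="columns")

print(without_rows)
print(without_col)
```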
You can use pandas.concat to concatenate the two DataFrames row-wise, followed by drop_duplicates to remove all the duplicated rows.
Use pd.concat followed by drop_duplicates(keep=False)
    pd.concat([df1, df2, df2]).drop_duplicates(keep=False)

It looks like

       a  b
    1  3  4

pd.concat adds the DataFrames together by appending one right after the other. If there is any overlap, it will be captured by the drop_duplicates method. However, drop_duplicates by default keeps the first observation and removes every other observation. In this case, we want every duplicate removed, hence the keep=False parameter, which does exactly that.
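The difference between the default and keep=False can be seen side by side in the sketch below. It uses the question's data, where df2 happens to be a subset of df1, so concatenating df2 once is enough for this illustration. The names stacked, keep_first, and keep_none are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["a", "b"])
df2 = pd.DataFrame([[1, 2], [5, 6]], columns=["a", "b"])

# Append df2's rows below df1's rows; original index labels are kept.
stacked = pd.concat([df1, df2])

# Default (keep="first"): one copy of each repeated row survives,
# so the rows shared with df2 are still in the result.
keep_first = stacked.drop_duplicates()

# keep=False: every row that occurs more than once is dropped entirely.
keep_none = stacked.drop_duplicates(keep=False)

print(keep_first)
print(keep_none)
```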
A special note on the repeated df2: with only one df2, any row in df2 that is not in df1 won't be considered a duplicate and will remain. The single-df2 version therefore only works when df2 is a subset of df1. If we concat df2 twice, however, every row of df2 is guaranteed to be a duplicate and will be removed.
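The effect of the second df2 shows up as soon as df2 contains a row that df1 lacks. In this sketch, df2_extra is a hypothetical frame (not from the question) whose row (7, 8) does not appear in df1.

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["a", "b"])

# Hypothetical second frame: (1, 2) is in df1, (7, 8) is not.
df2_extra = pd.DataFrame([[1, 2], [7, 8]], columns=["a", "b"])

# One copy of df2_extra: (7, 8) occurs only once, so it is not a
# duplicate and leaks into the result.
once = pd.concat([df1, df2_extra]).drop_duplicates(keep=False)

# Two copies: every row of df2_extra now occurs at least twice,
# so all of them are dropped.
twice = pd.concat([df1, df2_extra, df2_extra]).drop_duplicates(keep=False)

print(once)
print(twice)
```

One caveat worth keeping in mind: keep=False also drops rows that are duplicated within df1 itself, so this approach assumes the rows of df1 are unique.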