Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to remove common rows in two dataframes in Pandas?

I have two dataframes - df1 and df2.

df1 has row1,row2,row3,row4,row5
df2 has row2,row5

I want to have a new dataframe such that df1-df2. That is, the resultant dataframe should have rows as - row1,row3,row4.

like image 948
user3243499 Avatar asked Jul 31 '16 06:07

user3243499


People also ask

How do you remove identical rows in two data frames?

You can use pandas. concat to concatenate the two dataframes rowwise, followed by drop_duplicates to remove all the duplicated rows in them.

How do you get common rows from two Dataframes in pandas?

To find the common rows between two DataFrames with merge(), use the parameter “how” as “inner” since it works like SQL Inner Join and this is what we want to achieve.

How do you find the different rows between two data frames in Python?

You can use the DataFrame. diff() function to find the difference between two rows in a pandas DataFrame.


1 Answers

You can use pandas.concat to concatenate the two dataframes rowwise, followed by drop_duplicates to remove all the duplicated rows in them.

In [1]: import pandas as pd
df_1 = pd.DataFrame({"A":["foo", "foo", "foo", "bar"], "B":[0,1,1,1], "C":["A","A","B","A"]})
df_2 = pd.DataFrame({"A":["foo", "bar", "foo", "bar"], "B":[1,0,1,0], "C":["A","B","A","B"]})

In [2]: df = pd.concat([df_1, df_2])

In [3]: df
Out[3]: 
     A  B  C
0  foo  0  A
1  foo  1  A
2  foo  1  B
3  bar  1  A
0  foo  1  A
1  bar  0  B
2  foo  1  A
3  bar  0  B

In [4]: df.drop_duplicates(keep=False)
Out[4]: 
     A  B  C
0  foo  0  A
2  foo  1  B
3  bar  1  A
like image 91
Nickil Maveli Avatar answered Sep 25 '22 15:09

Nickil Maveli