
How to remove rows in a Pandas dataframe if the same row exists in another dataframe?

Tags:

python

pandas

I have two dataframes:

df1 = row1;row2;row3
df2 = row4;row5;row6;row2

I want my output dataframe to contain only the rows of df1 that do not also appear in df2, i.e.:

df_out = row1;row3 

How do I get this most efficiently?

This code does what I want, but it uses two nested for-loops:

import pandas as pd

a = pd.DataFrame({0:[1,2,3],1:[10,20,30]})
b = pd.DataFrame({0:[0,1,2,3],1:[0,1,20,3]})

# match_ident[i] is True when row i of a has no identical row in b
match_ident = []
for i in range(0,len(a)):
    found=False
    for j in range(0,len(b)):
        if a[0][i]==b[0][j]:
            if a[1][i]==b[1][j]:
                found=True
    match_ident.append(not(found))

a = a[match_ident]
RRC asked Jun 22 '17 18:06



People also ask

How do you remove rows in a DataFrame which are present in another DataFrame?

To drop a row or column from a DataFrame, use its drop() method, which removes entries by label. By default, rows are labelled with an integer index starting at 0, and columns are labelled by name.

How do you delete common rows in two Dataframes in pandas?

The pandas.DataFrame.drop_duplicates() method removes duplicate rows from a DataFrame. Duplicates can be detected on a selected subset of columns or on all columns.
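As a rough sketch of how drop_duplicates() can be applied to two DataFrames: stack the first frame once and the second frame twice, then drop every row that occurs more than once. The frames and variable names below are illustrative and not taken from the question. The trick assumes the first frame has no internal duplicates, because those would be removed as well.

```python
import pandas as pd

df1 = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})
df2 = pd.DataFrame({"x": [0, 1, 2, 3], "y": [0, 1, 20, 3]})

# Stack df1 once and df2 twice: every row of df2 now occurs at least twice,
# so keep=False discards all of them, including the rows df1 shares with df2.
stacked = pd.concat([df1, df2, df2])
only_in_df1 = stacked.drop_duplicates(keep=False)

print(only_in_df1)
```

The surviving rows keep their original df1 index labels.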

How do I delete rows in pandas DataFrame based on condition?

Use the pandas.DataFrame.drop() method to delete rows that match a condition, passing it the index labels of the matching rows.
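A minimal sketch of condition-based deletion, using a made-up frame and a made-up condition (y greater than 15): collect the index labels of the matching rows and pass them to drop(). A boolean mask gives the same result and is often simpler.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# Index labels of the rows that satisfy the condition.
labels_to_drop = df.index[df["y"] > 15]

# drop() removes rows by label and returns a new DataFrame.
dropped = df.drop(labels_to_drop)

# Same result with a boolean mask: keep the rows that fail the condition.
masked = df[~(df["y"] > 15)]

print(dropped)
```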




1 Answer

You can use merge with the indicator parameter and an outer join, query for filtering, and then remove the helper column with drop:

By default the DataFrames are joined on all shared columns, so the on parameter can be omitted.

print (pd.merge(a,b, indicator=True, how='outer')
         .query('_merge=="left_only"')
         .drop('_merge', axis=1))

   0   1
0  1  10
2  3  30
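The same idea can be wrapped in a small reusable helper. This is a sketch rather than part of the original answer: the function name rows_only_in_left is hypothetical, and it swaps the outer join for a left join (so rows found only in the right frame are never materialised). It also de-duplicates the right frame first and resets the index of the result.

```python
import pandas as pd


def rows_only_in_left(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `left` that have no identical row in `right`."""
    # De-duplicate `right` so a left row cannot be repeated by multiple matches.
    merged = left.merge(right.drop_duplicates(), how="left", indicator=True)
    # `_merge` is "left_only" where no matching row was found in `right`.
    keep = merged["_merge"] == "left_only"
    return merged.loc[keep].drop(columns="_merge").reset_index(drop=True)


# Same data as in the question.
a = pd.DataFrame({0: [1, 2, 3], 1: [10, 20, 30]})
b = pd.DataFrame({0: [0, 1, 2, 3], 1: [0, 1, 20, 3]})
result = rows_only_in_left(a, b)

print(result)
```

As in the answer above, the join uses every column the two frames share, so both frames are assumed to have the same columns.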
jezrael answered Sep 19 '22 00:09
