Delete a row from a dataframe if its column values are found in another

Question

import pandas as pd

df1 = pd.DataFrame({
    'vouchers': [100, 200, 300, 400],
    'units': [11, 12, 12, 13],
    'some_other_data': ['a', 'b', 'c', 'd'],
    })
df2 = pd.DataFrame({
    'vouchers': [500, 200, 600, 300],
    'units': [11, 12, 12, 13],
    'some_other_data': ['b', 'd', 'c', 'a'],
    })

Given two dataframes like the ones above, I want to do the following: if a voucher from df1 can be found in df2 and the corresponding units are the same, delete that voucher's entire row from df1.

So in this case the desired output would be:

df1 = {
    'vouchers': [100, 300, 400],
    'units': [11, 12, 13],
    'some_other_data': ['a', 'c', 'd'],
    }

What would be the best way to achieve this?

cs95 · Accepted Answer

You can do this efficiently with index operations, using pd.Index.isin:

u = df1.set_index(['vouchers', 'units'])
df1[~u.index.isin(pd.MultiIndex.from_arrays([df2.vouchers, df2.units]))]

   vouchers  units some_other_data
0       100     11               a
2       300     12               c
3       400     13               d
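
The same idea can be wrapped in a small reusable function. This is only a sketch: the helper name drop_matching_rows is hypothetical, and it builds both key indexes with pd.MultiIndex.from_frame instead of set_index, which is a variation on the snippet above rather than the answer's exact code.

```python
import pandas as pd

df1 = pd.DataFrame({
    'vouchers': [100, 200, 300, 400],
    'units': [11, 12, 12, 13],
    'some_other_data': ['a', 'b', 'c', 'd'],
})
df2 = pd.DataFrame({
    'vouchers': [500, 200, 600, 300],
    'units': [11, 12, 12, 13],
    'some_other_data': ['b', 'd', 'c', 'a'],
})


def drop_matching_rows(left, right, keys):
    """Return the rows of `left` whose `keys` combination does not occur in `right`."""
    # Build one tuple-like key per row from the chosen columns.
    left_keys = pd.MultiIndex.from_frame(left[keys])
    right_keys = pd.MultiIndex.from_frame(right[keys])
    # isin returns a boolean array aligned with the rows of `left`.
    return left[~left_keys.isin(right_keys)]


result = drop_matching_rows(df1, df2, ['vouchers', 'units'])
print(result)
```

Because the mask is positional, the original index labels of df1 (0, 2, 3 here) are kept in the result.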

BENY · Answer

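This can also be done with merge and indicator=True: find the index labels of the rows that match in both frames, then remove them with drop. Note that the merged frame gets a fresh 0..n-1 index, so this relies on df1 having the default index and df2 having no duplicate (vouchers, units) pairs.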

idx = (df1.merge(df2, on=['vouchers', 'units'], indicator=True, how='left')
          .loc[lambda x: x['_merge'] == 'both']
          .index)
df1 = df1.drop(idx, axis=0)
df1

   vouchers  units some_other_data
0       100     11               a
2       300     12               c
3       400     13               d
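
If df1 does not have the default index, or df2 may contain the same (vouchers, units) pair more than once, the labels returned by the merge no longer line up with df1. A possible workaround, sketched below under those assumptions, is to merge only against the de-duplicated key columns and use the indicator as a positional mask. The helper name drop_matching_rows_merge is hypothetical and is not part of the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({
    'vouchers': [100, 200, 300, 400],
    'units': [11, 12, 12, 13],
    'some_other_data': ['a', 'b', 'c', 'd'],
})
df2 = pd.DataFrame({
    'vouchers': [500, 200, 600, 300],
    'units': [11, 12, 12, 13],
    'some_other_data': ['b', 'd', 'c', 'a'],
})


def drop_matching_rows_merge(left, right, keys):
    """Anti-join: keep rows of `left` whose `keys` combination is absent from `right`."""
    # Unique key pairs only, so the left merge yields exactly one row
    # per row of `left`, in the same order.
    right_keys = right[keys].drop_duplicates()
    merged = left.merge(right_keys, on=keys, how='left', indicator=True)
    # Convert to a plain boolean array so the mask is applied by position
    # and does not depend on the index labels of `left`.
    keep = (merged['_merge'] == 'left_only').to_numpy()
    return left[keep]


result = drop_matching_rows_merge(df1, df2, ['vouchers', 'units'])
print(result)
```

Merging on the key columns alone also avoids the some_other_data_x / some_other_data_y suffixes that appear when both frames share a non-key column.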

Tags: python, pandas, dataframe

barciewicz
