df1 = {
'vouchers': [100, 200, 300, 400],
'units': [11, 12, 12, 13],
'some_other_data': ['a', 'b', 'c', 'd'],
}
df2 = {
'vouchers': [500, 200, 600, 300],
'units': [11, 12, 12, 13],
'some_other_data': ['b', 'd', 'c', 'a'],
}
Given the two dataframes like above, I want to do the following: if voucher from df1
can be found in df2
, and their corresponding unit is the same, then delete the entire voucher row from df1
.
So in this case the desired output would be:
df1 = {
'vouchers': [100, 300, 400],
'units': [11, 12, 13],
'some_other_data': ['a', 'c', 'd'],
}
What would be the best way to achieve this?
You can do this efficiently with index operations, using pd.Index.isin
:
u = df1.set_index(['vouchers', 'units'])
df1[~u.index.isin(pd.MultiIndex.from_arrays([df2.vouchers, df2.units]))]
vouchers units some_other_data
0 100 11 a
2 300 12 c
3 400 13 d
Doing with merge
indicator
, after we get the index
need to remove , using drop
idx=df1.merge(df2,on=['vouchers','units'],indicator=True,how='left').\
loc[lambda x : x['_merge']=='both'].index
df1=df1.drop(idx,axis=0)
df1
Out[374]:
vouchers units some_other_data
0 100 11 a
2 300 12 c
3 400 13 d
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With