I have a df (df1) that looks like:
df1 = pd.DataFrame([
['YYZ', 'SFO', 1],
['YYZ', 'YYD', 1],
['YYZ', 'EWR', 1],
['YYZ', 'DFW', 1],
['YYZ', 'LAX', 1],
['YYZ', 'YYC', 1]
], columns=['city1', 'city2', 'val'])
I have another df (df2) that is a subset of df1:
df2 = pd.DataFrame([
['YYZ', 'SFO', 1],
['YYZ', 'YYD', 1]
], columns=['city1', 'city2', 'val'])
I want all rows in df1 that are NOT in df2.
I've tried various options described in this post conditional slicing based on values of 2 columns, however I haven't been able to get it to work.
Your help would be appreciated.
merge
with indicator=True
query
to strip out only those with 'left_only'
df1.merge(
df2, how='outer', indicator=True
).query('_merge == "left_only"').drop('_merge', 1)
city1 city2 val
2 YYZ EWR 1
3 YYZ DFW 1
4 YYZ LAX 1
5 YYZ YYC 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With