I'm wondering how to compute a set difference between two dataframes in Python's pandas.
One dataframe (df1) has the format:
State City Population
NY Albany 856654
WV Wheeling 23434
SC Charleston 35323
OH Columbus 343534
WV Charleston 34523
And the second data frame (df2) is
State City
WV Wheeling
OH Columbus
And I need an operation that returns the following data frame
State City Population
NY Albany 856654
SC Charleston 35323
WV Charleston 34523
Essentially, I can't figure out how to "subtract" df2 from df1 based on the 2 columns (need both since I'll have repeats of city names across different states).
Do a left join with indicator=True, which adds a _merge column recording the origin of each row; then filter on that column to keep only the rows found in df1 alone:
df1.merge(df2, indicator=True, how="left")[lambda x: x._merge == 'left_only'].drop(columns='_merge')
#State City Population
#0 NY Albany 856654
#2 SC Charleston 35323
#4 WV Charleston 34523
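For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. It rebuilds the two frames from the question by hand (assuming the second row of df2 is meant to be OH Columbus) and passes the join keys explicitly through `on`; the variable names `keys`, `merged`, and `result` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "State": ["NY", "WV", "SC", "OH", "WV"],
    "City": ["Albany", "Wheeling", "Charleston", "Columbus", "Charleston"],
    "Population": [856654, 23434, 35323, 343534, 34523],
})
# Assumption: the second row of df2 is OH Columbus.
df2 = pd.DataFrame({
    "State": ["WV", "OH"],
    "City": ["Wheeling", "Columbus"],
})

# Match on both columns, since city names repeat across states.
keys = ["State", "City"]

# The left join keeps every row of df1; _merge says whether df2 had a match.
merged = df1.merge(df2, on=keys, how="left", indicator=True)

# Keep rows that exist only in df1, then drop the helper column.
result = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(result)
```

Without `on`, merge joins on all columns the two frames share, which happens to be the same pair here; naming the keys just guards against a stray shared column changing the result later.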