I'm wondering how to compute a set difference between two dataframes in Python's pandas.
One dataframe (df1) is of the format:
State  City          Population
NY     Albany        856654
WV     Wheeling      23434
SC     Charleston    35323
OH     Columbus      343534
WV     Charleston    34523
And the second dataframe (df2) is:
State  City
WV     Wheeling
OH     Columbus
And I need an operation that returns the following dataframe:
State   City        Population
NY      Albany      856654
SC      Charleston  35323
WV      Charleston  34523
Essentially, I can't figure out how to "subtract" df2 from df1 based on the 2 columns (need both since I'll have repeats of city names across different states).
Do a left join with indicator=True, which adds a _merge column recording where each row came from, then filter on that column and drop it:
df1.merge(df2, indicator=True, how="left")[lambda x: x._merge=='left_only'].drop(columns='_merge')
#  State        City  Population
#0    NY      Albany      856654
#2    SC  Charleston       35323
#4    WV  Charleston       34523
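For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. It assumes the second row of df2 is meant to be OH / Columbus, and it passes on=["State", "City"] explicitly, which is my addition; the one-liner above relies on merge defaulting to the columns the two frames share. The intermediate name merged is just for illustration.

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({
    "State": ["NY", "WV", "SC", "OH", "WV"],
    "City": ["Albany", "Wheeling", "Charleston", "Columbus", "Charleston"],
    "Population": [856654, 23434, 35323, 343534, 34523],
})
df2 = pd.DataFrame({
    "State": ["WV", "OH"],
    "City": ["Wheeling", "Columbus"],
})

# Left join on both key columns; indicator=True adds a "_merge" column
# whose value is "left_only" for rows of df1 with no match in df2.
merged = df1.merge(df2, on=["State", "City"], how="left", indicator=True)

# Keep only the unmatched rows, then drop the helper column.
result = merged[merged["_merge"] == "left_only"].drop(columns="_merge")
print(result)
```

Matching on both columns is what keeps WV / Charleston and SC / Charleston apart. If df2 carried extra columns, it would probably be safer to merge against df2[["State", "City"]] so they don't end up in the result.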