I have two dataframes, A and B, and I want the rows that are in A but not in B, just like the diagram right below the top left corner.
Dataframe A has columns ['a', 'b'] plus others, and B has columns ['a', 'b'] plus others. There are no NaN values. I tried the following:
1.
dfm = dfA.merge(dfB, on=['a','b'])
dfe = dfA[(~dfA['a'].isin(dfm['a'])) | (~dfA['b'].isin(dfm['b']))]
2.
dfm = dfA.merge(dfB, on=['a','b'])
dfe = dfA[(~dfA['a'].isin(dfm['a'])) & (~dfA['b'].isin(dfm['b']))]
3.
dfe = dfA[(~dfA['a'].isin(dfB['a'])) | (~dfA['b'].isin(dfB['b']))]
4.
dfe = dfA[(~dfA['a'].isin(dfB['a'])) & (~dfA['b'].isin(dfB['b']))]
but when I compute len(dfm) and len(dfe), they don't sum to len(dfA) (it's off by a few rows). I've tried this on dummy cases and #1 works, so my dataset may have some peculiarities I am unable to reproduce.
What's the right way to do this?
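One possible cause of the mismatch, sketched with made-up data (the values below are assumptions, not the asker's dataset): isin() tests each column on its own, so a row of dfA whose 'a' value and 'b' value both occur somewhere in dfm is treated as matched even when that exact (a, b) pair never occurs. Here the pair (1, 'y') is counted in neither dfm nor dfe.

```python
import pandas as pd

# Hypothetical data: (1, 'y') is in dfA but not in dfB,
# although 1 occurs in dfB['a'] and 'y' occurs in dfB['b'].
dfA = pd.DataFrame({'a': [1, 1, 2, 3], 'b': ['x', 'y', 'y', 'z']})
dfB = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})

# Attempt #1 from the question
dfm = dfA.merge(dfB, on=['a', 'b'])  # inner join on the (a, b) pair
dfe = dfA[(~dfA['a'].isin(dfm['a'])) | (~dfA['b'].isin(dfm['b']))]

# The row (1, 'y') is missing from both results
print(len(dfA), len(dfm), len(dfe))
```

Duplicate (a, b) pairs in dfB could push the count the other way, because an inner merge repeats a row of dfA once for every matching row in dfB.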
Check out this link
df = pd.merge(dfA, dfB, on=['a','b'], how="outer", indicator=True)
df = df[df['_merge'] == 'left_only']
One-liner:
df = pd.merge(dfA, dfB, on=['a','b'], how="outer", indicator=True).query('_merge == "left_only"')
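For a runnable check, here is a small sketch on made-up data. It is a variation on the merge above rather than the same call: it uses how='left' and merges only the de-duplicated key columns of dfB, so rows of dfA are not repeated and none of dfB's other columns are added. The column names valA and valB and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data; dfB contains a duplicated (a, b) pair on purpose
dfA = pd.DataFrame({'a': [1, 1, 2, 3],
                    'b': ['x', 'y', 'y', 'z'],
                    'valA': [10, 20, 30, 40]})
dfB = pd.DataFrame({'a': [1, 2, 2],
                    'b': ['x', 'y', 'y'],
                    'valB': [0.1, 0.2, 0.3]})

keys = ['a', 'b']

# Left merge against B's unique key pairs; '_merge' records where each row was found
marked = dfA.merge(dfB[keys].drop_duplicates(), on=keys, how='left', indicator=True)

only_in_A = marked[marked['_merge'] == 'left_only'].drop(columns='_merge')
in_both = marked[marked['_merge'] == 'both'].drop(columns='_merge')

# Every row of dfA lands in exactly one of the two groups
print(len(dfA), len(only_in_A), len(in_both))
```

Note that the merge result gets a fresh integer index, so dfA's original index labels are not carried over.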