pandas return columns in dataframe that are not in other dataframe

Tags: python, pandas

I have two dataframes that look like this:

import pandas as pd

df_1 = pd.DataFrame({
    'A': [1.0, 2.0, 3.0, 4.0],
    'B': [100, 200, 300, 400],
    'C': [2, 3, 4, 5],
})

df_2 = pd.DataFrame({
    'B': [1.0, 2.0, 3.0, 4.0],
    'C': [100, 200, 300, 400],
    'D': [2, 3, 4, 5],
})

Now, if I use the pandas .isin method on the columns, I can do something nifty like this:

>>> df_2.columns.isin(df_1.columns)
array([ True,  True, False], dtype=bool)

Columns B and C from df_2 exist in df_1, while D doesn't.

My question is: does anyone know of a way to return the labels of the columns that exist in df_2 but not in df_1?

Something like this:

array([u'D'], dtype=string)

Thank you in advance!

asked Mar 26 '17 12:03

cgclip

1 Answer

Pandas Index objects support set-like operations, so you can directly do:

df_2.columns.difference(df_1.columns)
Index([u'D'], dtype='object')
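As a minimal sketch of how this might be used end to end, the snippet below rebuilds the two frames from the question, takes the difference, and then uses the result to select the matching columns from df_2. It also shows the boolean-mask variant built on the .isin call from the question, which keeps the labels in df_2's original column order, whereas .difference() tries to sort its result by default. The variable names only_in_df_2, mask_based and subset are made up for illustration.

```python
import pandas as pd

df_1 = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [100, 200, 300, 400],
    "C": [2, 3, 4, 5],
})
df_2 = pd.DataFrame({
    "B": [1.0, 2.0, 3.0, 4.0],
    "C": [100, 200, 300, 400],
    "D": [2, 3, 4, 5],
})

# Labels present in df_2 but absent from df_1 (returned as an Index).
only_in_df_2 = df_2.columns.difference(df_1.columns)

# Same idea via a boolean mask; this keeps df_2's own column order.
mask_based = df_2.columns[~df_2.columns.isin(df_1.columns)]

# The resulting Index can be used directly to select those columns.
subset = df_2[only_in_df_2]

print(list(only_in_df_2))
print(list(mask_based))
print(subset.shape)
```

Either result can be turned into a plain Python list with list(...) or .tolist() if an Index is not what you need.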

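You can also compute the intersection, union and symmetric difference. Older pandas versions accepted the operators &, | and ^ for this, but that usage was deprecated in pandas 1.3, and from pandas 2.0 onward these operators act element-wise instead of as set operations. The named methods work across versions:

df_1.columns.intersection(df_2.columns)
Index([u'B', u'C'], dtype='object')

df_1.columns.union(df_2.columns)
Index([u'A', u'B', u'C', u'D'], dtype='object')

df_1.columns.symmetric_difference(df_2.columns)
Index([u'A', u'D'], dtype='object')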
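If you need several of these comparisons at once, one option is to wrap the Index set methods in a small helper. The function below, compare_columns, is a hypothetical name introduced here for illustration and is not part of pandas. It is a sketch that assumes both arguments are DataFrames with comparable (for example, all-string) column labels.

```python
import pandas as pd


def compare_columns(left, right):
    """Hypothetical helper: summarize how two frames' column labels overlap."""
    lcols, rcols = left.columns, right.columns
    return {
        "shared": list(lcols.intersection(rcols)),
        "either": list(lcols.union(rcols)),
        "only_left": list(lcols.difference(rcols)),
        "only_right": list(rcols.difference(lcols)),
        "not_shared": list(lcols.symmetric_difference(rcols)),
    }


df_1 = pd.DataFrame({
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [100, 200, 300, 400],
    "C": [2, 3, 4, 5],
})
df_2 = pd.DataFrame({
    "B": [1.0, 2.0, 3.0, 4.0],
    "C": [100, 200, 300, 400],
    "D": [2, 3, 4, 5],
})

report = compare_columns(df_1, df_2)
for name, labels in report.items():
    print(name, labels)
```

Using the named methods rather than operators keeps the intent explicit and avoids depending on how a given pandas version interprets &, | and ^ on an Index.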

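There used to be a - operator for set difference. It was deprecated in favor of .difference() and no longer computes a set difference in current pandas versions; on old versions it produced: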

df_2.columns - df_1.columns
FutureWarning: using '-' to provide set differences with Indexes is deprecated, use .difference()
Index([u'D'], dtype='object')

answered Nov 01 '22 03:11

jrjc
