Find equal columns between two dataframes

I have two pandas data frames, a and b:

a1   a2   a3   a4   a5   a6   a7
1    3    4    5    3    4    5
0    2    0    3    0    2    1
2    5    6    5    2    1    2

and

b1   b2   b3   b4   b5   b6   b7
3    5    4    5    1    4    3
0    1    2    3    0    0    2
2    2    1    5    2    6    5

The two data frames contain exactly the same data, but in a different order and with different column names. Based on the numbers in the two data frames, I would like to be able to match each column name in a to each column name in b.

It is not as easy as comparing the first row of a with the first row of b, because there are duplicated values. For example, both a4 and a7 have the value 5 in the first row, so they cannot be immediately matched to either b2 or b4.
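To make this reproducible, here is a rough sketch that builds the two frames shown above and lists, for each column of a, every column of b that shares its first-row value. The helper name `first_row_matches` is made up for illustration only.

```python
import pandas as pd

# The two sample frames from the question.
a = pd.DataFrame(
    [[1, 3, 4, 5, 3, 4, 5],
     [0, 2, 0, 3, 0, 2, 1],
     [2, 5, 6, 5, 2, 1, 2]],
    columns=["a1", "a2", "a3", "a4", "a5", "a6", "a7"],
)
b = pd.DataFrame(
    [[3, 5, 4, 5, 1, 4, 3],
     [0, 1, 2, 3, 0, 0, 2],
     [2, 2, 1, 5, 2, 6, 5]],
    columns=["b1", "b2", "b3", "b4", "b5", "b6", "b7"],
)


def first_row_matches(a, b):
    """For each column of a, list the columns of b with the same first-row value."""
    return {
        col_a: [col_b for col_b in b.columns if b[col_b].iloc[0] == a[col_a].iloc[0]]
        for col_a in a.columns
    }


candidates = first_row_matches(a, b)
print(candidates["a4"])  # ['b2', 'b4']
print(candidates["a7"])  # ['b2', 'b4']
```

Both a4 and a7 come back with the same two candidates, which is exactly the ambiguity described above.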

What is the best way to do this?

asked Jan 13 '20 17:01

OD1995

1 Answer

Here's one way: use broadcasting to compare every column of a with every column of b, then take all over the row axis to find the column pairs in which every row matches. np.where on that result gives index arrays into both dataframes' column names (with @piR's contribution):

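import numpy as np

# Compare each column of b (axis 1) with each column of a (axis 2) across all rows (axis 0)
i, j = np.where((a.values[:, None] == b.values[:, :, None]).all(axis=0))
dict(zip(a.columns[j], b.columns[i]))
# {'a5': 'b1', 'a7': 'b2', 'a6': 'b3', 'a4': 'b4', 'a1': 'b5', 'a3': 'b6', 'a2': 'b7'}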
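For anyone who wants to run this end to end, here is a self-contained sketch of the same broadcasting idea on the sample data from the question. The function name `match_columns` is my own, and it assumes both frames have the same number of rows in the same row order. Unlike the one-liner, it collects matches in lists, because a plain `dict(zip(...))` would keep only the last match if several columns of b were identical to one column of a.

```python
import numpy as np
import pandas as pd


def match_columns(a, b):
    """Map each column name of a to the column names of b with identical values."""
    av = a.to_numpy()  # shape (rows, n_a)
    bv = b.to_numpy()  # shape (rows, n_b)
    # Broadcast to shape (rows, n_b, n_a), then require equality on every row.
    equal = (av[:, None, :] == bv[:, :, None]).all(axis=0)  # shape (n_b, n_a)
    i, j = np.where(equal)  # i indexes b's columns, j indexes a's columns
    matches = {}
    for col_a, col_b in zip(a.columns[j], b.columns[i]):
        matches.setdefault(col_a, []).append(col_b)
    return matches


a = pd.DataFrame(
    [[1, 3, 4, 5, 3, 4, 5],
     [0, 2, 0, 3, 0, 2, 1],
     [2, 5, 6, 5, 2, 1, 2]],
    columns=["a1", "a2", "a3", "a4", "a5", "a6", "a7"],
)
b = pd.DataFrame(
    [[3, 5, 4, 5, 1, 4, 3],
     [0, 1, 2, 3, 0, 0, 2],
     [2, 2, 1, 5, 2, 6, 5]],
    columns=["b1", "b2", "b3", "b4", "b5", "b6", "b7"],
)

result = match_columns(a, b)
print(result)
```

Note that the intermediate boolean array has rows × columns of b × columns of a elements, so for very wide or very long frames a column-by-column comparison may be kinder to memory.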

answered Sep 28 '22 09:09

yatu

Tags: python, python-3.x, pandas
