
list of columns in common in two pandas dataframes

I'm planning merge operations on dataframes that each have a large number of columns, and I don't want the result to contain two columns with the same name. To check for that, I'm trying to get a list of the column names the two frames have in common:

import pandas as pd

a = [{'A': 3, 'B': 5, 'C': 3, 'D': 2},{'A': 2,  'B': 4, 'C': 3, 'D': 9}]
df1 = pd.DataFrame(a)
b = [{'F': 0,  'M': 4,  'B': 2,  'C': 8 },{'F': 2,  'M': 4, 'B': 3, 'C': 9}]
df2 = pd.DataFrame(b)

df1.columns
>> Index(['A', 'B', 'C', 'D'], dtype='object')
df2.columns
>> Index(['B', 'C', 'F', 'M'], dtype='object')

(df2.columns).isin(df1.columns)
>> array([ True,  True, False, False])

How do I apply that NumPy boolean array to the Index object so that it gives back just a list of the columns in common?

cardamom asked Jan 31 '18 09:01



1 Answer

Use numpy.intersect1d or Index.intersection:

import numpy as np

a = np.intersect1d(df2.columns, df1.columns)
print (a)
['B' 'C']

a = df2.columns.intersection(df1.columns)
print (a)
Index(['B', 'C'], dtype='object')

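Alternative syntax for the latter option, for older pandas versions only (from pandas 2.0 onward, & on an Index is an element-wise logical operation rather than a set intersection, so prefer intersection there):

df1.columns & df2.columns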
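
To answer the question as literally asked, the boolean array from isin can be used to index the Index itself, and tolist() turns the result into a plain Python list. The following is a minimal sketch that reuses the question's data; the names mask and common are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame([{'A': 3, 'B': 5, 'C': 3, 'D': 2}, {'A': 2, 'B': 4, 'C': 3, 'D': 9}])
df2 = pd.DataFrame([{'F': 0, 'M': 4, 'B': 2, 'C': 8}, {'F': 2, 'M': 4, 'B': 3, 'C': 9}])

# Boolean mask: True where a column of df2 also exists in df1
mask = df2.columns.isin(df1.columns)

# Index the Index with the mask, then convert to a plain list
common = df2.columns[mask].tolist()
print(common)  # ['B', 'C']

# The same list via the set-style method
common_alt = df1.columns.intersection(df2.columns).tolist()
print(common_alt)  # ['B', 'C']
```

Either list can then be inspected before a merge to see which column names would collide.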
jezrael answered Oct 17 '22 06:10