
Join two pandas DataFrames on a specific column

I am new to pandas and I am trying to join two DataFrames based on equality in one specific column. For example, suppose I have the following:

df1
A    B    C
1    2    3
2    2    2

df2
A    B    C
5    6    7
2    8    9

Both DataFrames have the same columns, and only the values in one column (say A) might match between them. What I want as output is this:

df3
A    B    C   B    C
2    8    9   2    2

The values for column 'A' are unique in both dataframes.

Thanks

asked Jun 01 '15 22:06 by ahajib



1 Answer

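Set column A as the index of both DataFrames, then concatenate them column-wise with an inner join so that only the values of A present in both are kept:

pd.concat([df1.set_index('A'),df2.set_index('A')], axis=1, join='inner')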

If you wish to maintain column A as a non-index, then:

pd.concat([df1.set_index('A'),df2.set_index('A')], axis=1, join='inner').reset_index()
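For anyone who wants to try this, here is a minimal runnable sketch using the sample data from the question. The names `joined` and `merged` and the suffixes `_df1` and `_df2` are illustrative assumptions, not part of the original answer. The `merge` call is an alternative technique, shown only because it keeps the column names unique, whereas `concat` leaves two columns named B and two named C.

```python
import pandas as pd

# Sample frames copied from the question
df1 = pd.DataFrame({"A": [1, 2], "B": [2, 2], "C": [3, 2]})
df2 = pd.DataFrame({"A": [5, 2], "B": [6, 8], "C": [7, 9]})

# Approach from the answer: index both frames by A, then concatenate
# column-wise; join="inner" keeps only the A values found in both frames.
joined = pd.concat(
    [df1.set_index("A"), df2.set_index("A")], axis=1, join="inner"
).reset_index()

# Alternative (assumption, not from the answer): merge on A and
# disambiguate the overlapping column names with suffixes.
merged = df1.merge(df2, on="A", how="inner", suffixes=("_df1", "_df2"))

print(joined)
print(merged)
```

Swapping the order of the two frames in the list passed to `pd.concat` puts the B and C values from df2 first, which matches the column order of the df3 shown in the question.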
answered Oct 16 '22 20:10 by vk1011