
Pandas: How to concatenate dataframes with different columns?

I tried to find the answer in the official Pandas documentation, but found it more confusing than helpful. Basically I have two dataframes with overlapping, but not identical column lists:

df1:
   A   B
0  22  34
1  78  42

df2:
   B   C
0  76  29
1  11  67

I want to merge/concatenate/append them so that the result is

df3:
   A   B   C
0  22  34  nan
1  78  42  nan
2  nan 76  29
3  nan 11  67

Should be fairly simple, but I've tried several intuitive approaches and always got errors. Can anybody help me?

gmolau asked May 30 '17 03:05

1 Answer

If you just want to concatenate the dataframes, you can use pd.concat. Columns that are missing from one of the dataframes are filled with NaN:

pd.concat([df1, df2])

Output:

      A   B     C
0  22.0  34   NaN
1  78.0  42   NaN
0   NaN  76  29.0
1   NaN  11  67.0

The original row labels are kept, so the index repeats (0, 1, 0, 1). You can then call reset_index to recreate a simple incrementing index:

pd.concat([df1, df2]).reset_index(drop=True)

Output:

      A   B     C
0  22.0  34   NaN
1  78.0  42   NaN
2   NaN  76  29.0
3   NaN  11  67.0
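
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the two frames from the question and uses the ignore_index=True argument of pd.concat, which should give the same result as the separate reset_index(drop=True) call in one step. The name df3 is taken from the question, not from the answer above.

```python
import pandas as pd

# Rebuild the two frames from the question.
df1 = pd.DataFrame({"A": [22, 78], "B": [34, 42]})
df2 = pd.DataFrame({"B": [76, 11], "C": [29, 67]})

# ignore_index=True discards the original row labels and numbers the
# rows of the result 0..n-1, so no separate reset_index call is needed.
df3 = pd.concat([df1, df2], ignore_index=True)

print(df3)
print(df3.dtypes)
```

Columns A and C come back as float64 because NaN is a floating-point value, while B stays integer since it has no missing entries.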
Scott Boston answered Nov 15 '22 16:11