Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Merging dataframes on index with pandas

I have two dataframes and each one has two index columns. I would like to merge them. For example, the first dataframe is the following:

                   V1  A      1/1/2012    12        2/1/2012    14 B      1/1/2012    15        2/1/2012    8 C      1/1/2012    17        2/1/2012    9 

The second dataframe is the following:

                   V2  A      1/1/2012    15        3/1/2012    21              B      1/1/2012    24        2/1/2012    9 D      1/1/2012    7        2/1/2012    16 

and as result I would like to get the following:

                   V1   V2  A      1/1/2012    12   15        2/1/2012    14   N/A        3/1/2012    N/A  21            B      1/1/2012    15   24        2/1/2012    8    9 C      1/1/2012    7    N/A        2/1/2012    16   N/A D      1/1/2012    N/A  7        2/1/2012    N/A  16 

I have tried a few versions using the pd.merge and .join methods, but nothing seems to work. Do you have any suggestions?

like image 311
km1234 Avatar asked Apr 11 '16 02:04

km1234


People also ask

How do I merge index columns in pandas?

The merge() function is used to merge DataFrame or named Series objects with a database-style join. The join is done on columns or indexes. If joining columns on columns, the DataFrame indexes will be ignored. Otherwise if joining indexes on indexes or indexes on a column or columns, the index will be passed on.

Does PD concat merge on index?

pd. concat joins on the index and can join two or more DataFrames at once. It does a full outer join by default.

How do I keep index from merging pandas?

How to Keep index when using Pandas Merge. By default, Pandas merge creates a new integer index for the merged DataFrame. If we wanted to preserve the index from the first DataFrame as the index of the merged DataFrame, we can specify the index explicitly using . set_axis(df1.


1 Answers

You should be able to use join, which joins on the index as default. Given your desired result, you must use outer as the join type.

>>> df1.join(df2, how='outer')             V1  V2 A 1/1/2012  12  15   2/1/2012  14 NaN   3/1/2012 NaN  21 B 1/1/2012  15  24   2/1/2012   8   9 C 1/1/2012  17 NaN   2/1/2012   9 NaN D 1/1/2012 NaN   7   2/1/2012 NaN  16 

Signature: _.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False) Docstring: Join columns with other DataFrame either on index or on a key column. Efficiently Join multiple DataFrame objects by index at once by passing a list.

like image 105
Alexander Avatar answered Oct 26 '22 04:10

Alexander