Concatenate Two DataFrames With Hierarchical Columns

Tags: python, pandas

I would like to merge two DataFrames while creating a multilevel column naming scheme denoting which DataFrame each column came from. For example:

In [98]: A=pd.DataFrame(np.arange(9.).reshape(3,3),columns=list('abc'))
In [99]: A
Out[99]: 
   a  b  c
0  0  1  2
1  3  4  5
2  6  7  8

In [100]: B=A.copy()

If I use pd.merge(), then I get

In [104]: pd.merge(A,B,left_index=True,right_index=True)
Out[104]: 
   a_x  b_x  c_x  a_y  b_y  c_y
0    0    1    2    0    1    2
1    3    4    5    3    4    5
2    6    7    8    6    7    8

That is what I expect from that statement. What I would like (but don't know how to get!) is:

In [104]: <<one or more statements>>
Out[104]: 
     A              B
     a    b    c    a    b    c
0    0    1    2    0    1    2
1    3    4    5    3    4    5
2    6    7    8    6    7    8

Can this be done without changing the original pd.DataFrame calls? I am reading the data into the DataFrames from .csv files, and that might be my problem.

YourEconProf asked Sep 23 '13 17:09


1 Answer

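Either of the two forms below works. In the first (dict) form, the top-level groups A and B may come out in either order (the columns within each group are unaffected); the second form, a list plus keys, should preserve the order you pass.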

IMHO this is pandonic!

In [5]: pd.concat(dict(A=A, B=B), axis=1)
Out[5]: 
   A        B      
   a  b  c  a  b  c
0  0  1  2  0  1  2
1  3  4  5  3  4  5
2  6  7  8  6  7  8

In [6]: pd.concat([A, B], keys=['A', 'B'], axis=1)
Out[6]: 
   A        B      
   a  b  c  a  b  c
0  0  1  2  0  1  2
1  3  4  5  3  4  5
2  6  7  8  6  7  8
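
On the .csv part of the question: as far as I can tell, how the frames were built should not matter to concat. Here is a minimal sketch of the list-plus-keys form as a standalone script. The CSV text is a hypothetical in-memory stand-in for the real files, and the names CSV_TEXT, combined, from_a and cell are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for the files read in the question.
CSV_TEXT = "a,b,c\n0,1,2\n3,4,5\n6,7,8\n"

A = pd.read_csv(io.StringIO(CSV_TEXT))
B = A.copy()

# keys= adds an outer column level naming the frame each column came from.
combined = pd.concat([A, B], keys=["A", "B"], axis=1)

# Selecting on the outer level gives back that frame's columns.
from_a = combined["A"]

# A single cell is addressed with an (outer, inner) column tuple.
cell = combined.loc[1, ("B", "c")]

print(combined)
```

The frames A and B themselves are left untouched; only the combined result carries the two-level columns.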

Jeff answered Sep 24 '22 06:09