I would like to merge two DataFrames while creating a multilevel column naming scheme that denotes which DataFrame each column came from. For example:
In [98]: A=pd.DataFrame(np.arange(9.).reshape(3,3),columns=list('abc'))
In [99]: A
Out[99]:
   a  b  c
0  0  1  2
1  3  4  5
2  6  7  8
In [100]: B=A.copy()
If I use pd.merge(), then I get
In [104]: pd.merge(A,B,left_index=True,right_index=True)
Out[104]:
   a_x  b_x  c_x  a_y  b_y  c_y
0    0    1    2    0    1    2
1    3    4    5    3    4    5
2    6    7    8    6    7    8
That is what I expect from that statement. What I would like (but I don't know how to get!) is:
In [104]: <<one or more statements>>
Out[104]:
   A        B
   a  b  c  a  b  c
0  0  1  2  0  1  2
1  3  4  5  3  4  5
2  6  7  8  6  7  8
Can this be done without changing the original pd.DataFrame calls? I am reading the data into the DataFrames from .csv files, and that might be my problem.
One way to join the columns of different DataFrames is the concat() function. objs: the DataFrames to combine. axis: 0 refers to the row axis and 1 refers to the column axis. join: the type of join ('inner' or 'outer').
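As a minimal sketch of what the axis and join arguments do, reusing the A and B frames from the question (the shortened frame C is a hypothetical addition, introduced only to make the join type visible):

```python
import numpy as np
import pandas as pd

A = pd.DataFrame(np.arange(9.).reshape(3, 3), columns=list("abc"))
B = A.copy()

# axis=0 stacks the frames on top of each other (index labels 0..2 repeat).
rows = pd.concat([A, B], axis=0)

# axis=1 places the frames side by side, aligned on the index.
cols = pd.concat([A, B], axis=1)

# Hypothetical frame with only the first two rows of B.
C = B.iloc[:2]

# join="inner" keeps only index labels present in every frame;
# the default join="outer" would keep all labels and fill gaps with NaN.
inner = pd.concat([A, C], axis=1, join="inner")

print(rows.shape, cols.shape, inner.shape)
```

Note that plain axis=1 concatenation leaves duplicate column names (a, b, c twice), which is why the question asks for an extra column level.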
pandas offers three main tools here: merge() combines data on common columns or indices, .join() combines data on a key column or an index, and concat() combines DataFrames across rows or columns.
When the joining column has a different name in each DataFrame, pass the “left_on” and “right_on” arguments to the pandas merge function instead of the single “on” parameter.
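For illustration only, here is a small sketch of left_on and right_on. The frames and the column names emp_id, id, name and dept are made up for this example and do not come from the question:

```python
import pandas as pd

# Hypothetical frames whose key columns have different names.
left = pd.DataFrame({"emp_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
right = pd.DataFrame({"id": [2, 3, 4], "dept": ["ops", "dev", "hr"]})

# Match left.emp_id against right.id; the default how="inner"
# keeps only keys found in both frames (2 and 3 here).
merged = pd.merge(left, right, left_on="emp_id", right_on="id")

print(merged)
```

Both key columns (emp_id and id) are kept in the result, so one of them can be dropped afterwards if it is redundant.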
In the first case (the dict), A and B may come out in an arbitrary order (not the columns within each frame, only whether A or B comes first); the second case (a list with keys) should preserve the order given.
IMHO this is pandonic!
In [5]: pd.concat(dict(A = A, B = B),axis=1)
Out[5]:
   A        B
   a  b  c  a  b  c
0  0  1  2  0  1  2
1  3  4  5  3  4  5
2  6  7  8  6  7  8
In [6]: pd.concat([ A, B ], keys=['A','B'],axis=1)
Out[6]:
   A        B
   a  b  c  a  b  c
0  0  1  2  0  1  2
1  3  4  5  3  4  5
2  6  7  8  6  7  8
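Since the question mentions reading the data from .csv files, here is a self-contained sketch of the keys= approach applied to frames loaded with read_csv. The in-memory CSV text stands in for the real files, which the question does not show, so the contents are an assumption:

```python
import io
import pandas as pd

# Stand-ins for the real .csv files (contents assumed for illustration).
csv_text = "a,b,c\n0,1,2\n3,4,5\n6,7,8\n"
A = pd.read_csv(io.StringIO(csv_text))
B = pd.read_csv(io.StringIO(csv_text))

# keys= adds an outer column level naming the source frame;
# the read_csv calls themselves do not need to change.
combined = pd.concat([A, B], axis=1, keys=["A", "B"])

# The columns are now a two-level MultiIndex.
only_b = combined["B"]            # all columns that came from B
a_col_b = combined[("A", "b")]    # column b of frame A
value = combined.loc[1, ("B", "c")]

print(combined)
```

Selecting with the outer key (combined["B"]) returns the original single-level frame, so downstream code can still work with each source separately.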