I'm trying to merge two dataframes in pandas on a common column name (orderid).  The resulting dataframe (the merged dataframe) is dropping the orderid from the 2nd data frame.  Per the documentation, the 'on' column should be kept unless you explicitly tell it not to.  
import pandas as pd    
df = pd.DataFrame([[1,'a'], [2, 'b'], [3, 'c']], columns=['orderid', 'ordervalue'])
df['orderid'] = df['orderid'].astype(str)
df2 = pd.DataFrame([[1,200], [2, 300], [3, 400], [4,500]], columns=['orderid', 'ordervalue'])
df2['orderid'] = df2['orderid'].astype(str)
pd.merge(df, df2, on='orderid', how='outer', copy=True, suffixes=('_left', '_right'))
Which outputs this:
|      |orderid | ordervalue_left | ordervalue_right |
|------|--------|-----------------|------------------|
| 0    | 1      | a               | 200              |
| 1    | 2      | b               | 300              |
| 2    | 3      | c               | 400              |
| 3    | 4      |                 | 500              |
What I am trying to create is this:
|      | orderid_left | ordervalue_left | orderid_left | ordervalue_right |
|------|--------------|-----------------|--------------|------------------|
| 0    | 1            | a               | 1            | 200              |
| 1    | 2            | b               | 2            | 300              |
| 2    | 3            | c               | 3            | 400              |
| 3    | NaN          | NaN             | 4            | 500              |
How should I write this?
Rename the orderid columns so that df has a column named orderid_left,
and df2 has a column named orderid_right:
import pandas as pd    
df = pd.DataFrame([[1,'a'], [2, 'b'], [3, 'c']], columns=['orderid', 'ordervalue'])
df['orderid'] = df['orderid'].astype(str)
df2 = pd.DataFrame([[1,200], [2, 300], [3, 400], [4,500]], columns=['orderid', 'ordervalue'])
df2['orderid'] = df2['orderid'].astype(str)
df = df.rename(columns={'orderid':'orderid_left'})
df2 = df2.rename(columns={'orderid':'orderid_right'})
result = pd.merge(df, df2, left_on='orderid_left', right_on='orderid_right', 
                  how='outer', suffixes=('_left', '_right'))
print(result)
yields
  orderid_left ordervalue_left orderid_right  ordervalue_right
0            1               a             1               200
1            2               b             2               300
2            3               c             3               400
3          NaN             NaN             4               500
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With