How can I force a suffix on a merge or join? I understand that one can be provided for when column names collide. In my case, though, merging df1 with df2 causes no collision, while merging the result with df2 again does, so only the second merge applies the suffixes. I would prefer every merge to add a suffix, because the column names get confusing when I do different combinations of merges.

2 Answers

You could force a suffix on the actual DataFrame:

In [11]: df_a = pd.DataFrame([[1], [2]], columns=['A'])

In [12]: df_b = pd.DataFrame([[3], [4]], columns=['B'])

In [13]: df_a.join(df_b)
Out[13]:
   A  B
0  1  3
1  2  4

By appending to its column names:

In [14]: df_a.columns = df_a.columns.map(lambda x: str(x) + '_a')

In [15]: df_a
Out[15]:
   A_a
0    1
1    2

Now joins won't need the suffix correction, whether they collide or not:

In [16]: df_b.columns = df_b.columns.map(lambda x: str(x) + '_b')

In [17]: df_a.join(df_b)
Out[17]:
   A_a  B_b
0    1    3
1    2    4
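
Assigning to .columns changes the frames in place, which can get in the way if the same frame is reused in several combinations. A minimal sketch of a non-mutating variant, assuming the same toy frames as above: rename with a callable returns a renamed copy, so the originals keep their plain names.

```python
import pandas as pd

df_a = pd.DataFrame([[1], [2]], columns=["A"])
df_b = pd.DataFrame([[3], [4]], columns=["B"])

# rename() with a callable returns a new DataFrame; df_a and df_b are untouched.
joined = df_a.rename(columns=lambda c: f"{c}_a").join(
    df_b.rename(columns=lambda c: f"{c}_b")
)

print(list(joined.columns))  # suffixed names in the result
print(list(df_a.columns))    # original names still in place
```

Since nothing is modified, the same df_a can be joined again later under a different suffix.
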
As of pandas version 0.24.2 you can add a suffix to column names on a DataFrame using the add_suffix method.

This makes a one-line merge with a forced suffix more readable, for example:

 df_merged = df1.merge(df2.add_suffix('_2'))  
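
One thing to watch for: add_suffix renames every column, including any column you intend to merge on, so the two frames may no longer share a key name. A rough sketch of one way around that, assuming a shared key column called key (the frames and the suffix_except helper here are hypothetical, not part of pandas):

```python
import pandas as pd

# Hypothetical example frames that share a "key" column and collide on "x".
df1 = pd.DataFrame({"key": [1, 2], "x": [10, 20]})
df2 = pd.DataFrame({"key": [1, 2], "x": [30, 40]})


def suffix_except(df, suffix, keep):
    """Hypothetical helper: return a copy with `suffix` appended to every
    column name except those listed in `keep`."""
    return df.rename(columns=lambda c: c if c in keep else f"{c}{suffix}")


# Every non-key column gets a suffix, whether or not it would have collided.
df_merged = suffix_except(df1, "_1", ["key"]).merge(
    suffix_except(df2, "_2", ["key"]), on="key"
)

print(df_merged)
```

Another option is to keep add_suffix as is and pass left_on and right_on to merge, at the cost of carrying both key columns in the result.
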
