Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I join 2 columns of a Pandas Data Frame by a comma?

Tags:

python

pandas

I'd like to join 2 columns of a Pandas Data Frame with a comma, i.e.: "abc" in column 1 joins with "123" in column 2 to become "abc, 123".

For example:

>>> import pandas as pd
>>> import numpy as np
>>> df = pd.DataFrame({'IDx': ['a','b',np.nan,'C'], 'IDy':['1','','2','D']})
>>> df
   IDx  IDy
0    a    1
1    b     
2  NaN    2
3    C    D

The following do not work:

>>> ', '.join([df['IDx'],df['IDy']])
>>> df.apply(lambda x: ', '.join([x['IDx'],x['IDy']]))

This is the desired result:

>>> df = pd.DataFrame({'ID': ['a, 1', 'b', '2', 'C, D']})
>>> df
     ID
0  a, 1
1     b
2     2
3  C, D
like image 696
nathanielng Avatar asked Feb 08 '23 21:02

nathanielng


1 Answers

You can use apply with fillna to empty string, map columns to string and strip:

df['ID'] = df[['IDx', 'IDy']].apply(lambda x: ','.join(x.fillna('').map(str)), axis=1)
df['ID'] = df['ID'].str.strip(',')
print df
   IDx IDy   ID
0    a   1  a,1
1    b        b
2  NaN   2    2
3    C   D  C,D

Or fillna to empty string and astype to string and strip:

df['ID'] = df['IDx'].fillna('').astype(str) + ',' + df['IDy'].fillna('').astype(str)
df['ID'] = df['ID'].str.strip(',')
print df
   IDx IDy   ID
0    a   1  a,1
1    b        b
2  NaN   2    2
3    C   D  C,D

EDIT: If dtype of your columns is string, you can omit map or astype:

df['ID'] = df[['IDx', 'IDy']].apply(lambda x: ','.join(x.fillna('')), axis=1)
df['ID'] = df['ID'].str.strip(',')

Or:

df['ID'] = df['IDx'].fillna('') + ',' + df['IDy'].fillna('')
df['ID'] = df['ID'].str.strip(',')
print df
like image 173
jezrael Avatar answered Feb 13 '23 03:02

jezrael