Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Concatenate two columns in pandas with NaN

I have a dataframe like this

df = (pd.DataFrame({'ID': ['ID1', 'ID2', 'ID3'], 
                        'colA': ['A', 'B', 'C'], 
                        'colB': ['D', np.nan, 'E']}))

df

    ID  colA   colB
0   ID1 A      D
1   ID2 B      NaN
2   ID3 C      E

I want to combine the two columns, however keep only column A if column B is NaN. Hence Expected output is

    ID  colA    colB    colC
0   ID1 A       D       A_D
1   ID2 B       NaN     B
2   ID3 C       E       C_E
like image 239
Hardik Gupta Avatar asked Dec 23 '22 22:12

Hardik Gupta


1 Answers

Idea is add _ to second column with _, so after replace missing value by empty string is not added _ for missing values:

df['colC'] = df['colA'] + ('_' + df['colB']).fillna('')
print (df)
    ID colA colB colC
0  ID1    A    D  A_D
1  ID2    B  NaN    B
2  ID3    C    E  C_E

If not sure where are missing values (in colA or colB):

df['colC'] = (df['colA'].fillna('') + '_' + df['colB'].fillna('')).str.strip('_')

Also is possible test each column separately:

m1 = df['colA'].isna()
m2 = df['colB'].isna()

df['colC'] = np.select([m1, m2, m1 & m2], 
                        [df['colB'], df['colA'], np.nan], 
                        default=df['colA'] + '_' + df['colB'])
print (df)

    ID colA colB colC
0  ID1    A    D  A_D
1  ID2    B  NaN    B
2  ID3  NaN    E    E
3  ID4  NaN  NaN  NaN
like image 112
jezrael Avatar answered Dec 25 '22 12:12

jezrael