I have a dataframe like this
df = (pd.DataFrame({'ID': ['ID1', 'ID2', 'ID3'],
'colA': ['A', 'B', 'C'],
'colB': ['D', np.nan, 'E']}))
df
ID colA colB
0 ID1 A D
1 ID2 B NaN
2 ID3 C E
I want to combine the two columns, however keep only column A if column B is NaN. Hence Expected output is
ID colA colB colC
0 ID1 A D A_D
1 ID2 B NaN B
2 ID3 C E C_E
Idea is add _
to second column with _
, so after replace missing value by empty string is not added _
for missing values:
df['colC'] = df['colA'] + ('_' + df['colB']).fillna('')
print (df)
ID colA colB colC
0 ID1 A D A_D
1 ID2 B NaN B
2 ID3 C E C_E
If not sure where are missing values (in colA
or colB
):
df['colC'] = (df['colA'].fillna('') + '_' + df['colB'].fillna('')).str.strip('_')
Also is possible test each column separately:
m1 = df['colA'].isna()
m2 = df['colB'].isna()
df['colC'] = np.select([m1, m2, m1 & m2],
[df['colB'], df['colA'], np.nan],
default=df['colA'] + '_' + df['colB'])
print (df)
ID colA colB colC
0 ID1 A D A_D
1 ID2 B NaN B
2 ID3 NaN E E
3 ID4 NaN NaN NaN
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With