I'm trying to concatenate Pandas DataFrame
columns with NaN values.
In [96]:df = pd.DataFrame({'col1' : ["1","1","2","2","3","3"],
'col2' : ["p1","p2","p1",np.nan,"p2",np.nan], 'col3' : ["A","B","C","D","E","F"]})
In [97]: df
Out[97]:
col1 col2 col3
0 1 p1 A
1 1 p2 B
2 2 p1 C
3 2 NaN D
4 3 p2 E
5 3 NaN F
In [98]: df['concatenated'] = df['col2'] +','+ df['col3']
In [99]: df
Out[99]:
col1 col2 col3 concatenated
0 1 p1 A p1,A
1 1 p2 B p2,B
2 2 p1 C p1,C
3 2 NaN D NaN
4 3 p2 E p2,E
5 3 NaN F NaN
Instead of 'NaN' values in "concatenated" column, I want to get "D" and "F" respectively for this example?
We can use stack
which will drop the NaN
, then use groupby.agg
and ','.join
the strings:
df['concatenated'] = df[['col2', 'col3']].stack().groupby(level=0).agg(','.join)
col1 col2 col3 concatenated
0 1 p1 A p1,A
1 1 p2 B p2,B
2 2 p1 C p1,C
3 2 NaN D D
4 3 p2 E p2,E
5 3 NaN F F
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With