My dataframe has four columns with colors. I want to combine them into one column called "Colors" and use commas to separate the values.
For example, I'm trying to combine into a Colors column like this :
ID Black Red Blue Green Colors
120 NaN red NaN green red, green
121 black Nan blue NaN black, blue
My code is:
df['Colors'] = df[['Black, 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x), axis=1)
But the output for ID 120 is:
, red, , green
And the output for ID 121 is:
black, , blue,
FOUND MY PROBLEM!
Earlier in my code, I replaced "None"
with " "
instead of NaN
. Upon making the change, plus incorporating feedback to insert [x.notnull()]
, it works!
df['Black'].replace('None', np.nan, inplace=True)
df['Colors'] = df[['Black, 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis=1)
You just need to handle NaNs
df['Colors'] = df[['Black', 'Red', 'Blue', 'Green']].apply(lambda x: ', '.join(x[x.notnull()]), axis = 1)
ID Black Red Blue Green Colors
0 120 NaN red NaN green red, green
1 121 black NaN blue NaN black, blue
Using dot
s=df.iloc[:,1:]
s.notnull()
Black Red Blue Green
0 False True False True
1 True True True False
s.notnull().dot(s.columns+',').str[:-1]
0 Red,Green
1 Black,Red,Blue
dtype: object
df['color']=s.notnull().dot(s.columns+',').str[:-1]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With