I'm interested in adding a text tag to a new column in a Pandas dataframe. The following example works but I get the copy warning and I don't fully understand if I should ignore it in this case.
The DataFrame simply has either a character or is an empty string:
In [1]: import pandas as pd
In [2]: df=pd.DataFrame({('A'):['x','','x',''], ('B'):['x','x','','']})
In [3]: df
Out[3]:
A B
0 x x
1 x
2 x
3
Create a new column called 'msg'
In [4]: df['msg'] = ''
In [5]: df
Out[5]:
A B msg
0 x x
1 x
2 x
3
Set the 'msg' column to 'red' if 'A' is not an empty string
In [6]: df['msg'][df['A'] != ''] = 'red;'
In [7]: df
Out[7]:
A B msg
0 x x red;
1 x
2 x red;
3
Concatenate 'blue' depending on the 'B' column values
In [8]: df['msg'][df['B'] != ''] += 'blue;'
In [9]: df
Out[9]:
A B msg
0 x x red;blue;
1 x blue;
2 x red;
3
Alternatively, I found using numpy.where produced the desired result. What is the proper way to do this in Pandas?
import numpy as np
df['msg'] += np.where(df['A'] != '','green;', '')
Update: 4/15/2018
Upon further thought, it would be useful to retain the data from the original DataFrame in certain cases but still attach a label ('color' in this example). The answer from @COLDSPEED led me to the following (changing 'blue;' to 'blue:' and preserving column 'B' data to include in the tag in this case):
df['msg'] = (v.where(df.applymap(len) > 0, '') +
df.where(df[['B']].applymap(len)>0,'')).agg(''.join, axis=1)
A B msg
0 x x red;blue:x
1 x blue:x
2 x red;
3
If you know your colors before-hand, you could use masking with DataFrame.where
and str.join
to get this done.
v = pd.DataFrame(
np.repeat([['red;', 'blue;']], len(df), axis=0),
columns=df.columns,
index=df.index
)
df['msg'] = v.where(df.applymap(len) > 0, '').agg(''.join, axis=1)
df
A B msg
0 x x red;blue;
1 x blue;
2 x red;
3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With