I need to rewrite some fields of a column in a DataFrame, based on a condition.
I have this dataframe:
  Vitamins                Sign
0        B                   -
1        C                   +
2        A                 NaN
3        Z                   2
4        E                   +
5        I             Expired
6        D      + Severe Cases
7        K  Expired+ Last Year
8        J                +New
And I need to rewrite both columns based on this condition: if 'Sign' contains the sign '+', then the sign should be copied and pasted onto the 'Vitamins' column, on the same row, without any space between the last word and the sign. Then the sign '+' should be removed from that field of the 'Sign' column. The result should be this dataframe:
  Vitamins               Sign
0        B                  -
1       C+                NaN
2        A                NaN
3        Z                  2
4       E+                NaN
5        I            Expired
6       D+       Severe Cases
7       K+  Expired Last Year
8       J+                New
I wrote this code for this:
import pandas as pd
import numpy as np
data = {'Vitamins': ['B', 'C', 'A', 'Z', 'E', 'I', 'D', 'K', 'J'],
'Sign': ['-', '+', np.nan, 2, '+', 'Expired', '+ Severe Cases', 'Expired+ Last Year', '+New']}
df = pd.DataFrame (data, columns = ['Vitamins', 'Sign'])
mask = (df.loc[:, 'Sign'].str.contains('+', na=False, regex = False))
df['Vitamins'] = str(df.loc[mask, 'Vitamins']) + '+'
df['Sign'] = df.loc[mask, 'Sign'].str.replace('+', '')
Unfortunately, it does not do what is needed.
How can this be resolved?
Many thanks in advance!
Use numpy.where:
In [1552]: import numpy as np
In [1553]: df['Vitamins'] = np.where(df['Sign'].str.contains('+', na=False, regex = False), df['Vitamins'] + '+', df['Vitamins'])
In [1557]: df['Sign'] = df['Sign'].replace('+', np.nan).replace(r'\+', '', regex=True)
In [1558]: df
Out[1558]:
  Vitamins               Sign
0        B                  -
1       C+                NaN
2        A                NaN
3        Z                  2
4       E+                NaN
5        I            Expired
6       D+       Severe Cases
7       K+  Expired Last Year
8       J+                New
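For comparison, here is a minimal self-contained sketch that stays closer to the mask-based attempt in the question. The likely reason that attempt fails is that str(df.loc[mask, 'Vitamins']) turns the whole selected Series into one printed string, and assigning to df['Vitamins'] and df['Sign'] directly overwrites every row instead of only the matching ones. Writing back through df.loc[mask, ...] avoids both problems. Stripping the leftover spaces with str.strip() is an extra step assumed here; it is not part of the numpy.where version above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Vitamins': ['B', 'C', 'A', 'Z', 'E', 'I', 'D', 'K', 'J'],
    'Sign': ['-', '+', np.nan, 2, '+', 'Expired', '+ Severe Cases',
             'Expired+ Last Year', '+New'],
})

# Rows whose 'Sign' holds a literal '+'; NaN and non-string values count as False.
mask = df['Sign'].str.contains('+', na=False, regex=False)

# Append '+' to 'Vitamins' on the matching rows only.
df.loc[mask, 'Vitamins'] = df.loc[mask, 'Vitamins'] + '+'

# Remove the '+' from the matching fields and trim leftover spaces
# (the trimming is an assumption added for this sketch).
cleaned = df.loc[mask, 'Sign'].str.replace('+', '', regex=False).str.strip()

# Fields that held nothing but '+' are now empty strings; store NaN instead.
df.loc[mask, 'Sign'] = cleaned.replace('', np.nan)

print(df)
```

Rows that do not match the mask, such as the integer 2 and the existing NaN, are left untouched because only the masked rows are written back.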