I have two columns as below:
id, colA, colB
0, a, 13
1, a, 52
2, b, 16
3, a, 34
4, b, 946
etc...
I am trying to create a third column, colC, that is colB if colA == a, otherwise 0.
This is what I was thinking, but it does not work:
data[data['colA']=='a']['colC'] = data[data['colA']=='a']['colB']
I was also thinking about using np.where(), but I don't think that would work here.
Any thoughts?
Using the apply() method: if you need to apply a function over an existing column to compute values that will be added as a new column in the existing DataFrame, the pandas DataFrame.apply() method should do the trick.
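As a rough illustration of the apply() route, here is a minimal sketch on the data from the question. It assumes the frame is called df and evaluates the condition row by row with axis=1; the lambda is only one way to write it, and row-wise apply tends to be slower than a vectorized approach on large frames.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "colA": ["a", "a", "b", "a", "b"],
    "colB": [13, 52, 16, 34, 946],
})

# Row-wise apply: keep colB where colA is 'a', otherwise use 0.
df["colC"] = df.apply(
    lambda row: row["colB"] if row["colA"] == "a" else 0,
    axis=1,
)

print(df)
```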
You can extract a column of a pandas DataFrame based on another column's value by using the DataFrame.query() method. query() filters the rows of a DataFrame with a boolean expression. For example, it can return the Courses column for rows where the Fee column equals 25000.
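A small sketch of what such a query() call might look like. The Courses and Fee column names come from the text above, but the rows themselves are made up for illustration, as are the names courses and selected.

```python
import pandas as pd

# Hypothetical data: only the column names come from the text above.
courses = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Fee": [22000, 25000, 23000, 25000],
})

# Filter rows with a boolean expression, then pick the Courses column.
selected = courses.query("Fee == 25000")["Courses"]

print(selected)
```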
You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to modify values.
Use loc with a mask to assign:
In [300]:
df.loc[df['colA'] == 'a', 'colC'] = df['colB']
df['colC'] = df['colC'].fillna(0)
df
Out[300]:
id colA colB colC
0 0 a 13 13
1 1 a 52 52
2 2 b 16 0
3 3 a 34 34
4 4 b 946 0
EDIT
or use np.where:
In [296]:
df['colC'] = np.where(df['colA'] == 'a', df['colB'], 0)
df
Out[296]:
id colA colB colC
0 0 a 13 13
1 1 a 52 52
2 2 b 16 0
3 3 a 34 34
4 4 b 946 0
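For anyone who wants to run both approaches outside an interactive session, here is a self-contained sketch that rebuilds the frame from the question. The names mask, via_loc and via_where are only illustrative, and the loc variant here initializes colC to 0 first instead of calling fillna afterwards, which is a small variation on the snippet above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [0, 1, 2, 3, 4],
    "colA": ["a", "a", "b", "a", "b"],
    "colB": [13, 52, 16, 34, 946],
})

# Boolean mask selecting the rows where colA equals 'a'.
mask = df["colA"] == "a"

# Approach 1: start from 0 and overwrite the matching rows with loc.
via_loc = df.copy()
via_loc["colC"] = 0
via_loc.loc[mask, "colC"] = via_loc.loc[mask, "colB"]

# Approach 2: np.where picks colB where the mask is True, else 0.
via_where = df.copy()
via_where["colC"] = np.where(mask, via_where["colB"], 0)

print(via_loc)
print(via_where)
```

Both frames should end up with the same colC. The original attempt in the question fails because data[...]['colC'] = ... is chained indexing, so the assignment lands on a temporary copy rather than on data itself.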