I am looking to create a new column in a Pandas data frame with the value of a list filtered by the df row value.
df = pd.DataFrame({'Index': [0,1,3,2], 'OtherColumn': ['a', 'b', 'c', 'd']})
Index OtherColumn
0 a
1 b
3 c
2 d
l = [1000, 1001, 1002, 1003]
Desired output:
Index OtherColumn Value
0 a -
1 b -
3 c 1003
2 d -
My code:
df.loc[df.OtherColumn == 'c', 'Value'] = l[df.Index]
Which returns an error since 'df.Index' is not recognised as a int but as a list (not filter by OtherColumn == 'c').
For R users, I'm looking for:
df[OtherColumn == 'c', Value := l[Index]]
Thanks.
Convert list to numpy array for indexing and then filter by mask in both sides:
m = df.OtherColumn == 'c'
df.loc[m, 'Value'] = np.array(l)[df.Index][m]
print (df)
Index OtherColumn Value
0 0 a NaN
1 1 b NaN
2 3 c 1003.0
3 2 d NaN
Or use numpy.where
:
m = df.OtherColumn == 'c'
df['Value'] = np.where(m, np.array(l)[df.Index], '-')
print (df)
Index OtherColumn Value
0 0 a -
1 1 b -
2 3 c 1003
3 2 d -
Or:
df['value'] = np.where(m, df['Index'].map(dict(enumerate(l))), '-')
Use Series.where
+ Series.map
:
df['value']=df['Index'].map(dict(enumerate(l))).where(df['OtherColumn']=='c','-')
print(df)
Index OtherColumn value
0 0 a -
1 1 b -
2 3 c 1003
3 2 d -
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With