Given this data frame:
>>> import numpy as np
>>> import pandas as pd
>>> a = pd.DataFrame(data={'words':['w1','w2','w3','w4','w5'],'value':np.random.rand(5)})
>>> a
value words
0 0.157876 w1
1 0.784586 w2
2 0.875567 w3
3 0.649377 w4
4 0.852453 w5
>>> b = pd.Series(data=['w3','w4'])
>>> b
0 w3
1 w4
I'd like to replace the elements of value with zero, but only for the words that match those in b.
The resulting data frame should therefore look like this:
value words
0 0.157876 w1
1 0.784586 w2
2 0 w3
3 0 w4
4 0.852453 w5
I thought of something along these lines: a.value[a.words==b] = 0, but it's obviously wrong.
You're close; just use pandas.Series.isin() instead of ==:
>>> a.value[a['words'].isin(b)] = 0
>>> a
value words
0 0.340138 w1
1 0.533770 w2
2 0.000000 w3
3 0.000000 w4
4 0.002314 w5
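Or you can use the .loc selector, which assigns in a single indexing step (older pandas also offered .ix here, but it was deprecated in 0.20 and removed in 1.0):
>>> a.loc[a['words'].isin(b), 'value'] = 0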
>>> a
value words
0 0.340138 w1
1 0.533770 w2
2 0.000000 w3
3 0.000000 w4
4 0.002314 w5
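To make the difference between == and isin() concrete, here is a minimal self-contained sketch. It assumes fixed values in place of np.random.rand so the result is reproducible; the names eq_failed and mask are only illustrative.

```python
import pandas as pd

# Fixed values are an assumption for reproducibility; the question uses random ones.
a = pd.DataFrame({'words': ['w1', 'w2', 'w3', 'w4', 'w5'],
                  'value': [0.1, 0.2, 0.3, 0.4, 0.5]})
b = pd.Series(['w3', 'w4'])

# == compares element by element, so two Series with different
# indexes (here, different lengths) cannot be compared this way.
try:
    a['words'] == b
    eq_failed = False
except ValueError:
    eq_failed = True

# isin() tests membership instead and returns one boolean per row of a.
mask = a['words'].isin(b)

# A single .loc call selects the rows and the column together,
# so the assignment is applied to a itself.
a.loc[mask, 'value'] = 0

print(a)
```

The mask is an ordinary boolean Series, so it can be kept in a variable and reused or inverted with ~mask if the opposite selection is needed.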
Update: you can see the documentation about the differences between .ix and .loc; some quotes:
.loc is strictly label based, will raise KeyError when the items are not found ...
.iloc is strictly integer position based (from 0 to length-1 of the axis), will raise IndexError when the requested indices are out of bounds ...
.ix supports mixed integer and label based access. It is primarily label based, but will fallback to integer positional access. .ix is the most general and will support any of the inputs to .loc and .iloc, as well as support for floating point label schemes. .ix is especially useful when dealing with mixed positional and label based hierarchical indexes ...
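The label-based versus position-based distinction in the quotes can be illustrated with a small sketch. The frame df and its index labels 10, 20, 30 are hypothetical, chosen so that labels and positions differ.

```python
import pandas as pd

# Hypothetical frame whose row labels (10, 20, 30) differ from positions (0, 1, 2).
df = pd.DataFrame({'value': [1.0, 2.0, 3.0]}, index=[10, 20, 30])

by_label = df.loc[20, 'value']   # row labelled 20, column labelled 'value'
by_position = df.iloc[1, 0]      # second row, first column: the same cell

# .loc is strictly label based: no row is labelled 1, so this raises KeyError.
try:
    df.loc[1, 'value']
    loc_error = None
except KeyError as exc:
    loc_error = type(exc).__name__

# .iloc is strictly position based: position 5 is out of bounds, so IndexError.
try:
    df.iloc[5, 0]
    iloc_error = None
except IndexError as exc:
    iloc_error = type(exc).__name__

print(by_label, by_position, loc_error, iloc_error)
```

A boolean mask such as the one produced by isin() is accepted by .loc, which is why it works for the replacement above.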