I am trying to update a column based on a condition on another column:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randn(6, 4), columns=list('abcd'))
df[df.b > 0].d = 1
Why doesn't this work? Without the condition, the assignment works.
When I do this with pandas v0.16.1, I get a warning telling me what's happening:
df=pd.DataFrame(np.random.randn(6,4),columns=list('abcd'))
df[df.b>0].d=1
/home/me/.local/lib/python2.7/site-packages/pandas/core/generic.py:1974: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
The expression df[df.b > 0] creates a copy of those rows that is no longer linked to the original DataFrame, so the assignment to .d changes only that temporary copy and df itself is left untouched. Following the suggestion in the warning, if I do:
df.loc[df.b > 0, 'd'] = 1
I get the desired results:
df
Out[10]:
          a         b         c         d
0 -0.127010  0.252527 -0.857680  1.000000
1  0.348888  0.780728 -0.710778  1.000000
2  0.840746 -0.456552  0.414482 -1.326191
3  0.864530  0.365728 -0.540530  1.000000
4  1.954639 -0.919998 -0.446927  1.949182
5 -0.928344 -0.145271  0.089434 -0.569934
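To make the difference between the two forms concrete, here is a minimal sketch that runs both and checks what happens to df. The fixed seed, the names mask, original_d and unchanged_after_chained, and the warning suppression are my own additions for illustration; they are not part of the question. How the warning is reported may differ between pandas versions, so the sketch simply silences it.

```python
import warnings

import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (assumption: the original
# question used unseeded random data).
rng = np.random.RandomState(0)
df = pd.DataFrame(rng.randn(6, 4), columns=list("abcd"))
original_d = df["d"].copy()  # snapshot of column d for comparison

mask = df["b"] > 0  # boolean Series: True where b is positive

# Chained form from the question. Boolean selection builds a new DataFrame,
# so the assignment lands on that temporary object and is then discarded.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence SettingWithCopyWarning here
    df[mask].d = 1

unchanged_after_chained = df["d"].equals(original_d)
print("df unchanged after chained assignment:", unchanged_after_chained)

# Single .loc call: row selection and column assignment happen in one
# operation on df itself, so the change sticks.
df.loc[mask, "d"] = 1
print(df)
```

The point of the .loc form is that pandas sees the row selector and the column label in a single indexing call, so there is no intermediate object for the assignment to get lost in.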