I'm trying to conditionally update multiple rows in my pandas DataFrame. Here's my data:
df = pd.DataFrame([[1,1,1], [2,2,2], [3,3,3]], columns=list('ABC'))
I can do the update I want in two steps:
df.loc[df['A'] == 1, 'B'] = df['C'] + 10
df.loc[df['A'] == 1, 'A'] = df['C'] + 11
Or I can update to constant values in one step:
df.loc[df['A'] == 1, ['A', 'B']] = [11, 12]
But I can't update multiple columns from other columns in a single step:
df.loc[df['A'] == 1, ['A', 'B']] = [df['C'] + 10, df['C'] + 11]
...
ValueError: shape mismatch: value array of shape (2,3) could not be broadcast to indexing result of shape (1,2)
Any ideas how I can do this?
Edit: Thanks @EdChum for the simple solution for the simple case - have updated the question to demonstrate a more complex reality.
To replace multiple values in a DataFrame, you can use the DataFrame.replace() method, passing a dictionary that maps each old value to its replacement.
You can replace values in all or selected columns based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to modify the values of a DataFrame.
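As a small illustration of the two approaches just described, here is a minimal sketch using the frame from the question. The replacement values (100, 200, 0) and the condition on A are arbitrary choices for the example, not part of the original post.

```python
import pandas as pd

df = pd.DataFrame([[1, 1, 1], [2, 2, 2], [3, 3, 3]], columns=list('ABC'))

# replace(): map old values to new ones everywhere in the frame.
# The mapping {1: 100, 2: 200} is an arbitrary example.
replaced = df.replace({1: 100, 2: 200})

# loc[]: set column B only on the rows where A equals 3.
# Work on a copy so the original frame is left untouched.
updated = df.copy()
updated.loc[updated['A'] == 3, 'B'] = 0
```

Both calls leave df itself unchanged: replace() returns a new frame by default, and the loc[] assignment is done on a copy.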
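Looking at this question a couple of years later, I see the error. The right-hand side has to match the shape of the selection, so you need to pull out the scalar values and assign those so they line up as desired. Note that .values[0] reads the first row of C, which works here because the only matching row is row 0: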
In [22]:
df.loc[df['A'] == 1, ['A', 'B']] = df['C'].values[0] + 10, df['C'].values[0] + 11
df
Out[22]:
    A   B  C
0  11  12  1
1   2   2  2
2   3   3  3
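When more than one row matches the condition, a single scalar per column is no longer enough. One possible generalisation, offered as a sketch rather than as part of the original answer, is to build a 2-D array with one row per matching row and one column per target column, so its shape matches the selection exactly. The frame below is a hypothetical variant of the question's data with two rows where A == 1, and the names mask, c and new_values are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 0 and 2 both satisfy A == 1.
df = pd.DataFrame([[1, 1, 1], [2, 2, 2], [1, 3, 3]], columns=list('ABC'))

mask = df['A'] == 1

# Take C only for the matching rows, so the lengths agree with the selection.
c = df.loc[mask, 'C']

# One row per matching row, one column per target column: shape (2, 2).
new_values = np.column_stack([(c + 10).to_numpy(), (c + 11).to_numpy()])

# Shapes now agree, so the assignment is positional and unambiguous.
df.loc[mask, ['A', 'B']] = new_values
```

The failing attempt in the question passed a list of two full-length Series, which becomes a (2, 3) array, while the selection had shape (1, 2). Filtering C by the same mask and stacking column-wise avoids that mismatch.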