
How can I conditionally update multiple columns in a pandas DataFrame

I'm trying to conditionally update multiple columns in my pandas DataFrame. Here's my data:

import pandas as pd
df = pd.DataFrame([[1,1,1], [2,2,2], [3,3,3]], columns=list('ABC'))

I can do the update I want in two steps:

df.loc[df['A'] == 1, 'B'] = df['C'] + 10
df.loc[df['A'] == 1, 'A'] = df['C'] + 11

Or I can update to constant values in one step:

df.loc[df['A'] == 1, ['A', 'B']] = [11, 12]

But I can't update multiple columns from other columns in a single step:

df.loc[df['A'] == 1, ['A', 'B']] = [df['C'] + 10, df['C'] + 11]
...
ValueError: shape mismatch: value array of shape (2,3) could not be broadcast to indexing result of shape (1,2)

Any ideas how I can do this?


Edit: Thanks @EdChum for the simple solution for the simple case - have updated the question to demonstrate a more complex reality.

Matthew asked Jun 07 '16 09:06


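1 Answer

Looking at this question a couple of years later, I see the error. The right-hand side is a list of two full-length Series, which is treated as an array of shape (2,3), while the selection on the left has shape (1,2), so the values cannot be broadcast. To make the assignment line up, access the scalar values and assign those instead. Note that .values[0] takes the first element of C, which works here because the only matching row is the first one: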

In [22]:
df.loc[df['A'] == 1, ['A', 'B']] = df['C'].values[0] + 10, df['C'].values[0] + 11
df

Out[22]:
    A   B  C
0  11  12  1
1   2   2  2
2   3   3  3
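
If more than one row can match the condition, taking a single scalar no longer fits. One possible approach, offered as a minimal sketch rather than the definitive way to do this, is to build an array with one row per matching row and one column per target column, so its shape matches the selection. The helper name update_a_b_from_c and the df_multi data are made up for illustration:

```python
import numpy as np
import pandas as pd


def update_a_b_from_c(df):
    """Hypothetical helper: where A == 1, set A = C + 10 and B = C + 11."""
    out = df.copy()
    mask = out['A'] == 1
    # Values of C for the matching rows only, as a plain NumPy array
    c = out.loc[mask, 'C'].to_numpy()
    # Shape (n_matching_rows, 2): one column per target column (A, B)
    new_values = np.column_stack([c + 10, c + 11])
    out.loc[mask, ['A', 'B']] = new_values
    return out


# The DataFrame from the question: only the first row matches
df = pd.DataFrame([[1, 1, 1], [2, 2, 2], [3, 3, 3]], columns=list('ABC'))
result = update_a_b_from_c(df)

# Made-up variant with two matching rows
df_multi = pd.DataFrame([[1, 1, 1], [1, 2, 5], [3, 3, 3]], columns=list('ABC'))
result_multi = update_a_b_from_c(df_multi)
```

Computing the mask once before assigning matters here, because the assignment changes column A, which the condition depends on.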
EdChum answered Nov 13 '22 09:11