
Changing certain values in multiple columns of a pandas DataFrame at once


Suppose I have the following DataFrame:

In [1]: df
Out[1]:
   apple  banana cherry
0      0       3   good
1      1       4    bad
2      2       5   good

This works as expected:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan

In [3]: df
Out[3]:
   apple  banana cherry
0      0       3   good
1    NaN       4    bad
2      2       5   good

But this doesn't:

In [2]: df[['apple', 'banana']][df.cherry == 'bad'] = np.nan

In [3]: df
Out[3]:
   apple  banana cherry
0      0       3   good
1      1       4    bad
2      2       5   good

Why? How can I achieve the conversion of both the 'apple' and 'banana' values without having to write out two lines, as in

In [2]: df['apple'][df.cherry == 'bad'] = np.nan

In [3]: df['banana'][df.cherry == 'bad'] = np.nan
dbliss asked Nov 08 '13 20:11





1 Answer

You should use loc and do this without chaining:

In [11]: df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan

In [12]: df
Out[12]:
   apple  banana cherry
0      0       3   good
1    NaN     NaN    bad
2      2       5   good

See the docs on returning a view versus a copy. Selecting a list of columns with df[['apple', 'banana']] returns a copy, so if you chain the indexing, the assignment is made to that copy (and thrown away). If you do it in a single loc call, pandas realises you want to assign to the original.
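
To make the difference concrete, here is a minimal sketch that runs both forms on a rebuilt version of the question's DataFrame. The float columns, the `original` snapshot, and the `chained_changed_df` flag are assumptions added for illustration. They are not part of the original post. Warnings are silenced only because pandas may complain about the chained assignment, depending on the version.

```python
import warnings

import numpy as np
import pandas as pd

# Assumption: float columns, so NaN can be stored without changing the dtype.
df = pd.DataFrame({
    "apple": [0.0, 1.0, 2.0],
    "banana": [3.0, 4.0, 5.0],
    "cherry": ["good", "bad", "good"],
})
mask = df["cherry"] == "bad"
original = df.copy()  # snapshot used to check whether df was modified

# Chained form: df[[...]] builds a new DataFrame, and the assignment lands on it.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about chained assignment
    df[["apple", "banana"]][mask] = np.nan
chained_changed_df = not df.equals(original)

# Single loc call: rows and columns are selected in one indexing operation on df.
df.loc[mask, ["apple", "banana"]] = np.nan

print("chained assignment changed df:", chained_changed_df)
print(df)
```

Run as a script, this reports that the chained assignment left df untouched, while the loc call sets both 'apple' and 'banana' to NaN in the 'bad' row.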

Andy Hayden answered Oct 15 '22 04:10
