Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replacing values in multiple specific columns of a Dataframe

I have the following dataframe:

import pandas as pd
import numpy as np

raw_data = {
    'Score1': [42, 52, -999, 24, 73], 
    'Score2': [-999, -999, -999, 2, 1],
    'Score3': [2, 2, -999, 2, -999]}
df = pd.DataFrame(raw_data, columns = ['Score1', 'Score2', 'Score3'])

and I want to replace the -999's with NaN only in columns Score2 and Score3, leaving column Score1 unchanged. I want to reference the columns to be modified by name and they might not be consecutive.

I have tried things like:

df.loc[:,('Score2', 'Score3')].replace(-999, np.nan, inplace=True)
df

but this does not work, I assume because it is operating on the copy. Is there a way of doing this in one statement?

I looked at Pandas replacing values on specific columns but found it quite confusing, so felt a simpler example would help.

like image 822
panda Avatar asked Dec 24 '22 05:12

panda


2 Answers

You cannnot use inplace=True, because subset returns a Series which may have its data as a view. Modifying it in place doesn't ALWAYS propogate it back to the parent object. That's why SettingWithCopyWarning is possible there (or raise if you set the option). You should never do this, nor is their ever a reason to do so.

df[['Score2', 'Score3']] = df[['Score2', 'Score3']].replace(-999, np.nan)
print (df)
   Score1  Score2  Score3
0      42     NaN     2.0
1      52     NaN     2.0
2    -999     NaN     NaN
3      24     2.0     2.0
4      73     1.0     NaN
like image 71
jezrael Avatar answered Dec 25 '22 18:12

jezrael


Use

In [282]: df.replace({'Score2': -999, 'Score3': -999}, np.nan)
Out[282]:
   Score1  Score2  Score3
0      42     NaN     2.0
1      52     NaN     2.0
2    -999     NaN     NaN
3      24     2.0     2.0
4      73     1.0     NaN
like image 43
Zero Avatar answered Dec 25 '22 19:12

Zero