
Pandas replacing values on specific columns

Tags: python, pandas

I am aware of these two similar questions:

Pandas replace values

Pandas: Replacing column values in dataframe

I used a different approach for substituting values, one that I think should be the cleanest, but it does not work. I know how to work around it, but I would like to understand why it does not work:

In [108]: df=pd.DataFrame([[1, 2, 8],[3, 4, 8], [5, 1, 8]], columns=['A', 'B', 'C']) 

In [109]: df
Out[109]: 
   A  B  C
0  1  2  8
1  3  4  8
2  5  1  8

In [110]: df.loc[:, ['A', 'B']].replace([1, 3, 2], [3, 6, 7], inplace=True)

In [111]: df
Out[111]: 
   A  B  C
0  1  2  8
1  3  4  8
2  5  1  8

In [112]: df.loc[:, 'A'].replace([1, 3, 2], [3, 6, 7], inplace=True)

In [113]: df
Out[113]: 
   A  B  C
0  3  2  8
1  6  4  8
2  5  1  8

If I slice only one column (In [112]), it behaves differently from slicing several columns (In [110]). As I understand it, the .loc method returns a view and not a copy. By my logic, this means that an inplace change on the slice should change the whole DataFrame. This is what happens at In [112], but not at In [110].

asked Jan 07 '16 10:01 by mcocdawc



1 Answer

Here is the answer by one of the developers: https://github.com/pydata/pandas/issues/11984

This should ideally show a SettingWithCopyWarning, but I think this is quite difficult to detect.

You should NEVER do this type of chained inplace setting. It is simply bad practice.

The idiomatic way is:

In [7]: df[['A','B']] = df[['A','B']].replace([1, 3, 2], [3, 6, 7])

In [8]: df
Out[8]: 
   A  B  C
0  3  7  8
1  6  4  8
2  5  3  8

(You can do this with df.loc[:, ['A', 'B']] as well, but the form above is clearer.)
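To make the difference concrete, here is a minimal sketch of why the multi-column case leaves df unchanged. It assumes that selecting with a list of labels builds a new object rather than a view, so the inplace replace only touches that temporary object. The names original, subset and untouched are illustrative and do not appear in the question.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 8], [3, 4, 8], [5, 1, 8]], columns=['A', 'B', 'C'])
original = df.copy()  # snapshot used only to compare against later

# Selecting several columns with a list of labels gives a separate object.
# Binding it to a name makes the temporary from In [110] visible.
subset = df.loc[:, ['A', 'B']]
subset.replace([1, 3, 2], [3, 6, 7], inplace=True)

# The replacement happened on `subset`, while `df` still equals the snapshot.
untouched = df.equals(original)
print(untouched)

# Assigning the result back, as in the answer, is what changes `df`.
df[['A', 'B']] = df[['A', 'B']].replace([1, 3, 2], [3, 6, 7])
print(df)
```

Whether the single-column form in In [112] modifies df depends on the pandas version and its copy semantics, so assigning the result back is the safer choice in both cases.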

answered Oct 25 '22 13:10 by mcocdawc