I am aware of these two similar questions:
Pandas replace values
Pandas: Replacing column values in dataframe
I used a different approach to substitute values, one that I thought would be the cleanest. But it does not work. I know how to work around it, but I would like to understand why it does not work:
In [108]: df=pd.DataFrame([[1, 2, 8],[3, 4, 8], [5, 1, 8]], columns=['A', 'B', 'C']) 
In [109]: df
Out[109]: 
   A  B  C
0  1  2  8
1  3  4  8
2  5  1  8
In [110]: df.loc[:, ['A', 'B']].replace([1, 3, 2], [3, 6, 7], inplace=True)
In [111]: df
Out[111]: 
   A  B  C
0  1  2  8
1  3  4  8
2  5  1  8
In [112]: df.loc[:, 'A'].replace([1, 3, 2], [3, 6, 7], inplace=True)
In [113]: df
Out[113]: 
   A  B  C
0  3  2  8
1  6  4  8
2  5  1  8
Slicing only one column (In [112]) behaves differently from slicing several columns (In [110]). As I understand it, the .loc method returns a view and not a copy. By my logic, this means that making an inplace change on the slice should change the whole DataFrame. This is what happens at In [112], but not at In [110].
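The In [110] behavior can be reproduced in isolation. The following is a minimal sketch, assuming a reasonably recent pandas; the names `original` and `unchanged` are illustrative and not part of the question. It shows that selecting a list of columns with .loc yields a new object, so the inplace replace acts on a temporary and never reaches `df`. Depending on the pandas version, a chained-assignment warning may be printed.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 8], [3, 4, 8], [5, 1, 8]], columns=["A", "B", "C"])
original = df.copy()

# Selecting a list of columns builds a new DataFrame (a copy), so the
# in-place replace modifies that temporary object, which is then discarded.
df.loc[:, ["A", "B"]].replace([1, 3, 2], [3, 6, 7], inplace=True)

# df still holds the original values.
unchanged = df.equals(original)
print(unchanged)
```

Whether a single-column selection such as In [112] propagates the change depends on the pandas version and its copy semantics, so it is safer not to rely on it.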
Step 3: Replace Values in Pandas DataFrame
Suppose that you want to replace multiple values with multiple new values for an individual DataFrame column. In that case, you may use this template: df['column name'] = df['column name'].replace(['1st old value','2nd old value',...],['1st new value','2nd new value',...])
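As a concrete instance of that template, here is a short sketch that reuses the DataFrame from the question. The replacement values are the ones from In [112]; only column A is touched.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 8], [3, 4, 8], [5, 1, 8]], columns=["A", "B", "C"])

# Replace 1 -> 3, 3 -> 6 and 2 -> 7 in column A, then assign the result
# back to the column instead of relying on inplace=True.
df["A"] = df["A"].replace([1, 3, 2], [3, 6, 7])

print(df)
```

The replacements are applied to the original values in one pass, so the 1 that becomes 3 is not replaced again by 6.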
You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] is used to access a group of rows and columns by label(s) or a boolean array. It can both read and modify the values of a pandas DataFrame.
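A conditional replacement with loc[] might look like the sketch below. The condition (A greater than 2) and the new value (0) are made up for illustration and do not come from the question.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 8], [3, 4, 8], [5, 1, 8]], columns=["A", "B", "C"])

# Boolean mask selects the rows, the label selects the column.
# Assigning through a single .loc call writes into df itself.
df.loc[df["A"] > 2, "B"] = 0

print(df)
```

Because the row mask and the column label are passed in one .loc call, this is a direct assignment on `df` and not a chained operation on an intermediate object.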
Replace several possible values across all columns of a dataframe. Modify a dataframe inplace (i.e., replace and modify the original dataframe) Replace a specific value in a specific dataframe column. Replace different values in multiple different columns.
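Two of these cases can be sketched with DataFrame.replace as follows. The target values (0, 10, 20, 30) are arbitrary examples, and the variable names `all_columns` and `per_column` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 8], [3, 4, 8], [5, 1, 8]], columns=["A", "B", "C"])

# Several possible values across all columns: every 1 or 2 becomes 0.
all_columns = df.replace([1, 2], 0)

# Different values in different columns: a nested dict maps
# column name -> {old value: new value}.
per_column = df.replace({"A": {1: 10}, "B": {1: 20, 2: 30}})

print(all_columns)
print(per_column)
```

Both calls return new DataFrames and leave `df` as it was; assign the result back if the original should change.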
Here is the answer by one of the developers: https://github.com/pydata/pandas/issues/11984
This should ideally show a SettingWithCopyWarning, but I think this is quite difficult to detect.
You should NEVER do this type of chained inplace setting. It is simply bad practice.
The idiomatic way is:
In [7]: df[['A','B']] = df[['A','B']].replace([1, 3, 2], [3, 6, 7])
In [8]: df
Out[8]: 
   A  B  C
0  3  7  8
1  6  4  8
2  5  3  8
(You can do this with df.loc[:,['A','B']] as well, but the form above is clearer.)
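If this pattern is needed in several places, it could be wrapped in a small helper. The function `replace_in_columns` below is hypothetical (it is not part of pandas or of the linked issue); it is only a sketch of the assign-back approach that avoids inplace=True on a selection.

```python
import pandas as pd


def replace_in_columns(frame, columns, to_replace, value):
    """Return a copy of frame with values replaced only in the given columns.

    Hypothetical helper for illustration; not a pandas API.
    """
    result = frame.copy()
    # Replace on the selected columns, then assign the result back.
    result[columns] = result[columns].replace(to_replace, value)
    return result


df = pd.DataFrame([[1, 2, 8], [3, 4, 8], [5, 1, 8]], columns=["A", "B", "C"])
out = replace_in_columns(df, ["A", "B"], [1, 3, 2], [3, 6, 7])
print(out)
```

Returning a new DataFrame keeps the caller's data untouched, which sidesteps the view-versus-copy question entirely.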