Assume I have the following data set:
import pandas as pd

lst = ['u', 'v', 'w', 'x', 'y']
lst_rev = list(reversed(lst))
dct = dict(zip(lst, lst_rev))
df = pd.DataFrame({'A': ['a', 'b', 'a', 'c', 'a'],
                   'B': lst},
                  dtype='category')
Now I want to replace the values of column B in df using dct. I know I can do
df.B.map(dct).fillna(df.B)
to get the expected output, but when I tested with replace (which seems more straightforward to me), it failed. The output is shown below:
df.B.replace(dct)
Out[132]:
0 u
1 v
2 w
3 v
4 u
Name: B, dtype: object
which is different from the output of
df.B.map(dct).fillna(df.B)
Out[133]:
0 y
1 x
2 w
3 v
4 u
Name: B, dtype: object
I think I can see what is happening, but why does it happen?
0 u --> change to y then change to u
1 v --> change to x then change to v
2 w
3 v
4 u
Appreciate your help.
It's because replace keeps applying the dictionary:
df.B.replace({'u': 'v', 'v': 'w', 'w': 'x', 'x': 'y', 'y': 'Hello'})
0 Hello
1 Hello
2 Hello
3 Hello
4 Hello
Name: B, dtype: object
With the given dct, 'u' -> 'y', then 'y' -> 'u'.
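The chaining described above can be modelled in plain Python. This is only a sketch of the observed behavior, not pandas' actual implementation; the helper names replace_sequentially and replace_single_pass are made up for illustration. It assumes the dictionary entries are applied one after another in insertion order, which reproduces both outputs shown in this thread.

```python
def replace_sequentially(values, mapping):
    """Apply each (old -> new) pair in turn to the whole list.

    A value replaced by an earlier pair can be replaced again by a later one,
    which is the chained behavior seen in the question.
    """
    out = list(values)
    for old, new in mapping.items():
        out = [new if v == old else v for v in out]
    return out


def replace_single_pass(values, mapping):
    """Look each original value up exactly once; unmapped values are kept."""
    return [mapping.get(v, v) for v in values]


lst = ['u', 'v', 'w', 'x', 'y']
dct = dict(zip(lst, reversed(lst)))  # {'u': 'y', 'v': 'x', 'w': 'w', 'x': 'v', 'y': 'u'}

chained = replace_sequentially(lst, dct)
single = replace_single_pass(lst, dct)
print(chained)  # ['u', 'v', 'w', 'v', 'u']  (matches the replace output in the question)
print(single)   # ['y', 'x', 'w', 'v', 'u']  (matches the map/fillna output)

# The example dictionary from this answer collapses everything to 'Hello'.
chain_dct = {'u': 'v', 'v': 'w', 'w': 'x', 'x': 'y', 'y': 'Hello'}
all_hello = replace_sequentially(lst, chain_dct)
print(all_hello)  # ['Hello', 'Hello', 'Hello', 'Hello', 'Hello']
```

Under this model, 'u' becomes 'y' at the first step and is turned back into 'u' when the 'y' entry is applied at the last step, while 'x' only becomes 'v' because the 'v' entry has already been processed by then.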
This behavior is not intended, and was recognized as a bug.
This is the GitHub issue that first identified the behavior, and it was added to the milestone for pandas 0.24.0. I can confirm the replacement works as expected in the current version on GitHub.
Here is the PR containing the fix.
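On a pandas version that still shows the chained behavior, one way to sidestep it is to look each value up exactly once with map and a fallback. The following is a minimal sketch, assuming a plain object-dtype Series rather than the categorical column from the question; the variable names s and result are illustrative.

```python
import pandas as pd

lst = ['u', 'v', 'w', 'x', 'y']
dct = dict(zip(lst, reversed(lst)))

# Assumption: an ordinary object-dtype Series, not the categorical column above.
s = pd.Series(lst, name='B')

# dct.get(v, v) returns the mapped value, or the original value when the key
# is missing, so each element is translated once and never re-translated.
result = s.map(lambda v: dct.get(v, v))
print(result.tolist())  # ['y', 'x', 'w', 'v', 'u']
```

This gives the same values as df.B.map(dct).fillna(df.B) without needing the separate fillna step.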