Pandas: how can 'replace' work after 'loc'?

Tags: python, pandas

I have tried many times, but 'replace' does not seem to work after 'loc'. For example, I want to apply a regex replacement to 'conlumn_b', but only in the rows where 'conlumn_a' is 'apple'.

Here is my sample code:

df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'].replace(r'^11*', 'XXX',inplace=True, regex=True)

Example:

conlumn_a       conlumn_b
apple           123
banana          11
apple           11
orange          33

The result I expect for 'df' is:

conlumn_a       conlumn_b
apple           123
banana          11
apple           XXX
orange          33

Has anyone run into this issue of needing 'replace' with a regex after 'loc'?

Or do you have any other good solutions?

Thank you so much for your help!

Asked by Jonathan Zhou on Jan 18 '18 at 06:01


1 Answer

inplace=True modifies only the object the method is called on.

When you index with .loc and a boolean mask, you get back a new object (a copy of the selected data), not the original DataFrame.

>>> id(df)
4587248608

And,

>>> id(df.loc[df['conlumn_a'] == 'apple', 'conlumn_b'])
4767716968

Calling an in-place replace on this new object therefore updates only that temporary copy, which is then discarded. The original DataFrame is left untouched.
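To make the copy behaviour concrete, here is a minimal sketch. It assumes the sample data from the question, with 'conlumn_b' stored as strings so that the regex has something to match; the names `mask` and `selected` are only for illustration.

```python
import pandas as pd

# Sample data from the question; conlumn_b is stored as strings here (an assumption).
df = pd.DataFrame({
    "conlumn_a": ["apple", "banana", "apple", "orange"],
    "conlumn_b": ["123", "11", "11", "33"],
})

mask = df["conlumn_a"] == "apple"

# Boolean-mask selection with .loc returns a new Series holding copied data.
selected = df.loc[mask, "conlumn_b"]

# The in-place replace changes only that new Series.
selected.replace(r"^11$", "XXX", regex=True, inplace=True)

print(selected.tolist())         # ['123', 'XXX']
print(df["conlumn_b"].tolist())  # ['123', '11', '11', '33']
```

The selection is modified, but `df` still holds its original values, which is why the one-liner in the question appears to do nothing.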


Also note that you're calling replace on a column of integers. Regular expressions only match strings, so a regex-based replace does nothing on that column.

Here's what I suggest as a workaround: don't use regex at all, and assign the result back through .loc.
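A quick way to see this is to run the same regex replace on an integer Series and on its string version. This is only a sketch; the Series `numbers` is a made-up stand-in for the 'apple' rows of 'conlumn_b'.

```python
import pandas as pd

# Hypothetical stand-in for the 'apple' rows of conlumn_b, stored as integers.
numbers = pd.Series([123, 11])

# Regex replace on an integer Series: the pattern has no strings to match.
unchanged = numbers.replace(r"^11$", "XXX", regex=True)

# After converting to strings, the same pattern matches the value '11'.
as_text = numbers.astype(str).replace(r"^11$", "XXX", regex=True)

print(unchanged.tolist())  # [123, 11]
print(as_text.tolist())    # ['123', 'XXX']
```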

m = df['conlumn_a'] == 'apple'
df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b'].replace(11, 'XXX')

df

  conlumn_a conlumn_b
0     apple       123
1    banana        11
2     apple       XXX
3    orange        33

Or, if you need regex-based substitution:

df.loc[m, 'conlumn_b'] = df.loc[m, 'conlumn_b']\
           .astype(str).replace('^11$', 'XXX', regex=True)

Note, however, that this converts your column to an object column.
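Putting the pieces together, a self-contained version of the regex approach might look like the sketch below. It departs from the snippet above in one way: the whole column is converted to strings first, on the assumption that writing 'XXX' into an integer column may trigger a warning or an error in recent pandas versions.

```python
import pandas as pd

# Sample data from the question, with conlumn_b as integers.
df = pd.DataFrame({
    "conlumn_a": ["apple", "banana", "apple", "orange"],
    "conlumn_b": [123, 11, 11, 33],
})

# Convert the whole column to strings up front, so the column can hold 'XXX'
# and the regex has strings to match against.
df["conlumn_b"] = df["conlumn_b"].astype(str)

m = df["conlumn_a"] == "apple"

# Replace on the selected rows, then assign the result back through .loc
# instead of relying on inplace=True.
df.loc[m, "conlumn_b"] = df.loc[m, "conlumn_b"].replace(r"^11$", "XXX", regex=True)

print(df["conlumn_b"].tolist())  # ['123', '11', 'XXX', '33']
```

Only the 'apple' row whose value is exactly '11' changes; the 'banana' row keeps its '11' because it falls outside the mask.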

Answered by cs95 on Sep 21 '22 at 20:09