Python Replace Whole Values in Dataframe String and Not Substrings

I am trying to replace strings in a dataframe if the whole string equals another string. I do not want to replace substrings.

So:

If I have df:

 Index  Name       Age
   0     Joe        8
   1     Mary       10
   2     Marybeth   11

and I want to replace "Mary" when the whole string matches "Mary" with "Amy" so I get

 Index  Name       Age
   0     Joe        8
   1     Amy        10
   2     Marybeth   11

I'm doing the following:

df['Name'] = df['Name'].apply(lambda x: x.replace('Mary','Amy'))

My understanding from searching around is that the defaults of replace set regex=False and replace should look for the whole value in the dataframe to be "Mary". Instead I'm getting this result:

 Index  Name       Age
   0     Joe        8
   1     Amy        10
   2     Amybeth    11

What am I doing wrong?

Windstorm1981 asked Jan 11 '18 19:01

1 Answer

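Series.replace with a dict is the way to go. Your code calls Python's str.replace on each value through apply, which, like Series.str.replace, replaces every occurrence of the substring. Series.replace and DataFrame.replace match whole values unless you pass regex=True.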

df['Name'].replace({'Mary':'Amy'})
Out[582]: 
0         Joe
1         Amy
2    Marybeth
Name: Name, dtype: object
df['Name'].replace({'Mary':'Amy'},regex=True)
Out[583]: 
0        Joe
1        Amy
2    Amybeth
Name: Name, dtype: object

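Notice that the two results differ: with the default regex=False only values equal to 'Mary' are replaced, while with regex=True the pattern also matches inside 'Marybeth'.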
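To make the difference easy to reproduce, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question, and the variable names substring_result and whole_value_result are only illustrative.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({"Name": ["Joe", "Mary", "Marybeth"], "Age": [8, 10, 11]})

# Python's str.replace, applied to each value, rewrites every occurrence
# of the substring "Mary", so "Marybeth" is changed as well.
substring_result = df["Name"].apply(lambda x: x.replace("Mary", "Amy"))

# Series.replace with the default regex=False only replaces values that
# are equal to "Mary" as a whole.
whole_value_result = df["Name"].replace({"Mary": "Amy"})

print(substring_result.tolist())
print(whole_value_result.tolist())
```

Assigning whole_value_result back with df['Name'] = whole_value_result should give the table the question asks for.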

Series.str.replace: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.replace.html

DataFrame.replace: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.replace.html
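Two related variations, sketched here as an assumption about what might be useful rather than as part of the original answer: DataFrame.replace accepts a nested dict to limit the replacement to one column, and if regex=True is needed for other reasons, anchoring the pattern with ^ and $ keeps it from matching inside longer values. The names by_column and anchored are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Joe", "Mary", "Marybeth"], "Age": [8, 10, 11]})

# Nested dict {column: {old: new}}: look only in the "Name" column for
# values equal to "Mary" and replace them with "Amy".
by_column = df.replace({"Name": {"Mary": "Amy"}})

# With regex=True, anchor the pattern so that "Marybeth" does not match.
anchored = df["Name"].replace({r"^Mary$": "Amy"}, regex=True)

print(by_column)
print(anchored.tolist())
```

Both calls return new objects and leave df unchanged, so assign the result back if you want to keep it.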

BENY answered Sep 23 '22 20:09
