
Replace values in a series pandas [duplicate]

Why is it that when I want to replace a value, I have to use this line of code:

data['Organization'].str.replace('Greece','Rome')

Why can't I use this:

data['Organization'].replace('Greece','Rome')

I've seen others use the second form without going through the string accessor. My question is: can I do the replacement with the Series replace method directly, and if so, what is the line of code?

grim_reaper asked Jan 27 '23 12:01



2 Answers

pd.Series.replace is different to pd.Series.str.replace:

  • pd.Series.replace is used to replace an element in its entirety. It will work also on non-string elements.
  • pd.Series.str.replace is used to replace substrings, optionally using regex.

Here's a minimal example demonstrating the difference:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'fuz', np.nan]})

df['B'] = df['A'].replace(['foo', 'fuz'], ['food', 'fuzzy'])
df['C'] = df['A'].str.replace('f.', 'ba', regex=True)

print(df)

     A      B    C
0  foo   food  bao
1  fuz  fuzzy  baz
2  NaN    NaN  NaN
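
To illustrate the point that pd.Series.replace also works on non-string elements, here is a small sketch. The integer Series and the variable names are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical numeric Series, used only to illustrate the point.
s = pd.Series([1, 2, 3])

# Series.replace swaps whole elements, so it works on integers too.
replaced = s.replace(2, 20)
print(replaced.tolist())  # [1, 20, 3]

# The .str accessor is only available for string data, so asking for it
# on an integer Series raises AttributeError.
try:
    s.str.replace('2', '20')
    str_accessor_failed = False
except AttributeError:
    str_accessor_failed = True
print(str_accessor_failed)  # True
```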
jpp answered Jan 30 '23 03:01



str.replace replaces substrings, so it also works with partial matches. In pandas versions before 2.0 it treated the pattern as a regex by default; since 2.0 the default is regex=False. replace, on the other hand, only performs replacements on full matches by default, unless the regex flag is set to True.

data['Organization'] = (
    data['Organization'].replace({'Greece': 'Rome'}, regex=True))
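
As a rough sketch of the full-match versus partial-match behaviour described above, the three calls can be compared side by side. The sample values in 'Organization' are invented for illustration, since the question does not show the actual data.

```python
import pandas as pd

# Hypothetical sample data; the real contents of 'Organization' are unknown.
data = pd.DataFrame({'Organization': ['Greece', 'Ancient Greece', 'Rome']})

# Full-match replacement: only elements exactly equal to 'Greece' change.
full = data['Organization'].replace('Greece', 'Rome')
print(full.tolist())       # ['Rome', 'Ancient Greece', 'Rome']

# With regex=True, replace also rewrites matching substrings.
with_regex = data['Organization'].replace({'Greece': 'Rome'}, regex=True)
print(with_regex.tolist())  # ['Rome', 'Ancient Rome', 'Rome']

# str.replace always works on substrings; regex=False is passed explicitly
# so the result does not depend on the pandas version's default.
substring = data['Organization'].str.replace('Greece', 'Rome', regex=False)
print(substring.tolist())   # ['Rome', 'Ancient Rome', 'Rome']
```

If the column only ever contains the bare value 'Greece', all three calls give the same result, which may be why both forms appear in other people's code.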
cs95 answered Jan 30 '23 03:01
