
How do I replace all the instances of a certain character in a dataframe?

I have a dataframe which has many instances of '?' in different rows. The data type of the columns is 'object'. Now I want to replace all the '?' with 0. How do I do that?

user517696 asked Dec 18 '22 07:12



2 Answers

Consider the dataframe df

import pandas as pd

df = pd.DataFrame([['?', 1], [2, '?']])

print(df)

   0  1
0  ?  1
1  2  ?

replace

df.replace('?', 0)

   0  1
0  0  1
1  2  0
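
Since the question mentions object columns, a conversion step may be wanted after the replacement. A minimal sketch, assuming every remaining value can be parsed as a number (the name cleaned is only illustrative):

```python
import pandas as pd

df = pd.DataFrame([['?', 1], [2, '?']])

# replace returns a new DataFrame; df itself is left unchanged
cleaned = df.replace('?', 0)

# convert each column to a numeric dtype (raises if a value is not numeric)
cleaned = cleaned.apply(pd.to_numeric)

print(cleaned.dtypes)
```

If some cells might hold other non-numeric text, pd.to_numeric also accepts errors='coerce', which turns such values into NaN instead of raising.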

mask or where

df.mask(df == '?', 0)
# df.where(df != '?', 0)

   0  1
0  0  1
1  2  0
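
Both calls rely on an element-wise comparison that yields a boolean frame: mask replaces cells where the condition is True, while where keeps cells where it is True. A small sketch that makes the intermediate step visible (the variable names are only illustrative):

```python
import pandas as pd

df = pd.DataFrame([['?', 1], [2, '?']])

# element-wise comparison: True exactly where the cell equals '?'
is_question = df == '?'
print(is_question)

# mask replaces cells where the condition is True;
# where keeps cells where the condition is True and replaces the rest
masked = df.mask(is_question, 0)
kept = df.where(~is_question, 0)
```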

However, imagine your dataframe has ? within longer strings.

df = pd.DataFrame([['a?', 1], [2, '?b']])

print(df)

    0   1
0  a?   1
1   2  ?b

replace with regex=True

df.replace(r'\?', '0', regex=True)

    0   1
0  a0   1
1   2  0b
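
To make the difference concrete: without regex=True, replace only matches whole cell values, so a ? embedded in a longer string is left alone. A short comparison on the same frame (the names literal and pattern are only illustrative):

```python
import pandas as pd

df = pd.DataFrame([['a?', 1], [2, '?b']])

# literal replace: only cells exactly equal to '?' would change, so nothing does here
literal = df.replace('?', '0')

# regex replace: every ? inside a string is substituted; non-string cells are untouched
pattern = df.replace(r'\?', '0', regex=True)

print(literal)
print(pattern)
```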
piRSquared answered Dec 21 '22 09:12



I think it is better to replace it with the string '0'. Otherwise you get mixed types (numbers alongside strings), and some pandas functions can fail:

df.replace('?', '0')
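
A small sketch of the mixed-type problem, assuming an object column that stores numbers as strings (the Series s is made up for illustration and does not come from the question):

```python
import pandas as pd

# hypothetical object column with numeric strings and a '?' placeholder
s = pd.Series(['1', '?', '3'], dtype=object)

# integer 0 lands next to strings: the column now mixes two Python types
mixed = s.replace('?', 0)
print(sorted(t.__name__ for t in set(map(type, mixed))))

# string '0' keeps every value a str, so a single cast converts the whole column
as_int = s.replace('?', '0').astype(int)
print(as_int.tolist())
```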

Also, if you need to replace a run of consecutive ? with a single 0, add + to match one or more occurrences:

df = pd.DataFrame([['a???', '?'], ['s?', '???b']])
print(df)
      0     1
0  a???     ?
1    s?  ???b

df = df.replace(r'\?+', '0', regex=True)
print(df)
    0   1
0  a0   0
1  s0  0b

# Equivalent pattern: inside a character class, ? needs no escaping
df = df.replace('[?]+', '0', regex=True)
print(df)
    0   1
0  a0   0
1  s0  0b
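
The effect of + can be checked with Python's re module alone, which uses the same pattern syntax that pandas applies when regex=True. A quick sketch:

```python
import re

# without +, each ? is replaced on its own
print(re.sub(r'\?', '0', 'a???'))    # a000

# with +, a whole run of ? collapses into a single replacement
print(re.sub(r'\?+', '0', 'a???'))   # a0

# a character class is an alternative to escaping the ?
print(re.sub(r'[?]+', '0', '???b'))  # 0b
```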
jezrael answered Dec 21 '22 09:12
