I have a dataframe which has many instances of '?' in different rows. The data type of the columns is 'object'. Now I want to replace all the '?' with 0. How do I do that?
Consider the dataframe df:
import pandas as pd
df = pd.DataFrame([['?', 1], [2, '?']])
print(df)
   0  1
0  ?  1
1  2  ?
replace
df.replace('?', 0)
   0  1
0  0  1
1  2  0
mask or where
df.mask(df == '?', 0)
# df.where(df != '?', 0)
   0  1
0  0  1
1  2  0
However, imagine your dataframe has ? within longer strings.
df = pd.DataFrame([['a?', 1], [2, '?b']])
print(df)
    0   1
0  a?   1
1   2  ?b
replace with regex=True (use a raw string so that \? is not treated as an invalid Python escape sequence)
df.replace(r'\?', '0', regex=True)
    0   1
0  a0   1
1   2  0b
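To make the difference between the two calls concrete, here is a minimal sketch on the same frame. The variable names whole_cell and substring are only illustrative; the behaviour shown is that plain replace matches a cell only when its entire value equals '?', while regex=True applies the pattern inside each string cell.

```python
import pandas as pd

df = pd.DataFrame([['a?', 1], [2, '?b']])

# Without regex=True, only cells whose whole value is '?' are replaced,
# so 'a?' and '?b' are left untouched.
whole_cell = df.replace('?', '0')

# With regex=True, the pattern is substituted inside each string cell.
# Non-string cells (1 and 2) are not affected.
substring = df.replace(r'\?', '0', regex=True)

print(whole_cell.values.tolist())  # [['a?', 1], [2, '?b']]
print(substring.values.tolist())   # [['a0', 1], [2, '0b']]
```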
I think it is better to replace it with the string '0', because otherwise you get mixed types (numbers alongside strings) and some pandas functions can fail:
df.replace('?', '0')
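A small sketch of the mixed-type concern, assuming a hypothetical frame in which every other cell is a string (this frame and the helper all_strings are not from the question). Whether a column ends up mixed depends on what else it holds, so treat this as an illustration rather than a general rule.

```python
import pandas as pd

# Hypothetical object-dtype frame whose other cells are all strings
df = pd.DataFrame([['a', '?'], ['?', 'b']], dtype=object)

with_int = df.replace('?', 0)    # cells are now a mix of str and int
with_str = df.replace('?', '0')  # every cell is still a str


def all_strings(frame):
    """Return True if every cell in the frame is a Python str."""
    return all(isinstance(v, str) for v in frame.to_numpy().ravel())


print(all_strings(with_int))  # False
print(all_strings(with_str))  # True

# String methods skip the non-string cell and give a missing value there
upper = with_int[0].str.upper().tolist()
print(upper)
```

If the columns are meant to be numeric in the end, replacing with the integer 0 and converting afterwards may be the better fit; the string '0' mainly helps when the column is going to stay text.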
Also, if you need to replace a run of multiple ? with a single 0, add + to match one or more occurrences:
df = pd.DataFrame([['a???', '?'], ['s?', '???b']])
print(df)
      0     1
0  a???     ?
1    s?  ???b
df = df.replace(r'\?+', '0', regex=True)
print(df)
    0   1
0  a0   0
1  s0  0b
df = df.replace('[?]+', '0', regex=True)
print(df)
    0   1
0  a0   0
1  s0  0b
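The two patterns above are interchangeable. As a minimal sketch, the escaped form can also be built with re.escape from the standard library, which is handy when the character to replace comes from a variable. The names escaped, char_class and auto are only illustrative.

```python
import re

import pandas as pd

df = pd.DataFrame([['a???', '?'], ['s?', '???b']])

# Backslash-escaped question mark, one or more times
escaped = df.replace(r'\?+', '0', regex=True)

# Character class: inside [] the question mark needs no escaping
char_class = df.replace('[?]+', '0', regex=True)

# re.escape('?') produces the escaped form, so the pattern is \?+ again
auto = df.replace(re.escape('?') + '+', '0', regex=True)

print(escaped.equals(char_class))  # True
print(escaped.equals(auto))        # True
```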