I have a DataFrame df like this:
import numpy as np
import pandas as pd

df = pd.DataFrame([
{'Name': 'Chris', 'Item Purchased': 'Sponge', 'Cost': 22.50},
{'Name': 'Kevyn', 'Item Purchased': 'Kitty Litter', 'Cost': '.........'},
{'Name': 'Filip', 'Item Purchased': 'Spoon', 'Cost': '...'}],
index=['Store 1', 'Store 1', 'Store 2'])
I want to replace the missing values in the 'Cost' column (the runs of dots) with np.nan. So far I have tried:
df['Cost']=df['Cost'].str.replace("\.\.+", np.nan)
and
df['Cost']=re.sub('\.\.+',np.nan,df['Cost'])
but neither of them works. Please help.
Use DataFrame.replace with the regex=True switch.
df = df.replace(r'\.+', np.nan, regex=True)
df
         Cost Item Purchased   Name
Store 1  22.5         Sponge  Chris
Store 1   NaN   Kitty Litter  Kevyn
Store 2   NaN          Spoon  Filip
The pattern \.+ specifies one or more dots. You could also use [.]+ as a pattern to the same effect.
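One caveat: the unanchored pattern matches any string that contains a dot, so I would expect a cell such as 'approx. 5' to be affected too if the frame had one. A minimal sketch of a narrower variant follows, assuming only the 'Cost' column needs cleaning and that a placeholder is a cell made up entirely of dots. The anchored pattern and the numeric conversion are my additions, not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        {'Name': 'Chris', 'Item Purchased': 'Sponge', 'Cost': 22.50},
        {'Name': 'Kevyn', 'Item Purchased': 'Kitty Litter', 'Cost': '.........'},
        {'Name': 'Filip', 'Item Purchased': 'Spoon', 'Cost': '...'},
    ],
    index=['Store 1', 'Store 1', 'Store 2'],
)

# Anchor the pattern with ^ and $ so only cells consisting solely of dots
# match, and apply it to the 'Cost' column instead of the whole frame.
cost = df['Cost'].replace(r'^\.+$', np.nan, regex=True)

# Depending on the pandas version the column may still be object dtype
# after the replacement, so convert it to a numeric dtype explicitly.
df['Cost'] = pd.to_numeric(cost)

print(df['Cost'])
```

If every non-numeric entry should become NaN regardless of what it contains, pd.to_numeric(df['Cost'], errors='coerce') is another option that skips the regex entirely.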