I have a dataframe:
import pandas as pd
df = pd.DataFrame({'a':[1,2,3], 'b':[5, '12$sell', '1$sell']})
I want to remove $sell from column b, so I tried the str.replace() method like this:
df['b'] = df['b'].str.replace("$sell","")
but it doesn't replace the given string and gives me the same dataframe as the original.
It works when I use it with apply:
df['b'] = df['b'].apply(lambda x: str(x).replace("$sell",""))
So I want to know why it does not work in the first case.
Note: I tried replacing only $ and shockingly it works.
Python's str.replace() returns a new string instead of modifying the original, so the result has to be assigned back. Both snippets above already assign the result to df['b'], so a missing assignment is not the cause here.
You can replace a string in a pandas DataFrame column by using Series.replace(), Series.str.replace(), or apply() with a lambda function.
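$ is a regex metacharacter (it anchors the end of the string), and pandas versions before 2.0 treat the pattern passed to str.replace as a regex by default. Escape it or pass regex=False: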
df['b'] = df['b'].str.replace(r"\$sell", "", regex=True)
print (df)
   a    b
0  1  NaN
1  2   12
2  3    1
Or, treating the pattern as a literal string:
df['b'] = df['b'].str.replace("$sell","", regex=False)
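To see why the unescaped pattern never matches, here is a minimal sketch using Python's standard re module instead of pandas. The values list and the variable names are only illustrative; the point is that "$sell" asks for the text "sell" after the end of the string, which cannot occur.

```python
import re

values = ["12$sell", "1$sell"]

# Unescaped: "$" anchors the end of the string, so "$sell" requires
# "sell" to appear after the end, which can never match.
unescaped = [re.sub("$sell", "", v) for v in values]

# Escaped: re.escape builds a pattern that matches "$sell" literally.
escaped = [re.sub(re.escape("$sell"), "", v) for v in values]

print(unescaped)  # ['12$sell', '1$sell']
print(escaped)    # ['12', '1']
```

The same reasoning applies whenever str.replace is asked to treat its pattern as a regex.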
If you also want to keep the value 5, which is numeric, use Series.replace with regex=True to replace substrings; numeric values are left untouched:
df['b'] = df['b'].replace(r"\$sell", "", regex=True)
print (df['b'].apply(type))
0    <class 'int'>
1    <class 'str'>
2    <class 'str'>
Name: b, dtype: object
Or cast all values in the column to strings:
df['b'] = df['b'].astype(str).str.replace("$sell","", regex=False)
print (df['b'].apply(type))
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: b, dtype: object
For better performance, if the column has no missing values, you can use a list comprehension:
df['b'] = [str(x).replace("$sell","") for x in df['b']]
print (df)
   a   b
0  1   5
1  2  12
2  3   1
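The options above can be compared side by side. This is a rough sketch assuming a reasonably recent pandas; the names literal, kept and as_text are only illustrative. Passing regex explicitly in every call keeps the result independent of the default, which differs between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [5, '12$sell', '1$sell']})

# 1) .str.replace with regex=False: "$sell" is matched literally.
#    The non-string value 5 becomes NaN, because .str only handles strings.
literal = df['b'].str.replace("$sell", "", regex=False)

# 2) Series.replace with an escaped regex: substrings are replaced
#    and the integer 5 is left as it is.
kept = df['b'].replace(r"\$sell", "", regex=True)

# 3) Cast every value to str first, then replace literally.
as_text = df['b'].astype(str).str.replace("$sell", "", regex=False)

print(literal.tolist())
print(kept.tolist())
print(as_text.tolist())
```

Which one fits depends on whether the numeric entries should stay numeric (option 2) or end up as strings (option 3).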