I want to replace all strings that contain a specific substring. So for example if I have this dataframe:
import pandas as pd
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'], 'sport': ['tennis', 'football', 'basketball']})
I could replace football with the string 'ball sport' like this:
df.replace({'sport': {'football': 'ball sport'}})
What I want, though, is to replace everything that contains ball (in this case football and basketball) with 'ball sport'. Something like this:
df.replace({'sport': {'[strings that contain ball]': 'ball sport'}})
You can replace a substring in a pandas DataFrame column with the DataFrame.replace() method. By default this method looks for an exact string match and replaces it with the specified value. Pass regex=True to match and replace substrings.
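As a minimal sketch of the regex=True approach on the question's data: the pattern `^.*ball.*$` is an assumption of mine, not something from the original. It matches the entire value whenever 'ball' appears anywhere in it, so the whole string is replaced rather than only the substring.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})

# Assumed pattern: match the whole value if it contains 'ball' anywhere,
# so the full string is swapped out, not just the 'ball' part.
result = df.replace({'sport': {r'^.*ball.*$': 'ball sport'}}, regex=True)

print(result)
```

Note that replace() returns a new DataFrame here; df itself is left unchanged unless you assign the result back.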
For a plain Python string, the easiest way to replace all occurrences of a given substring is the str.replace() method.
Using "contains" to Find a Substring in a Pandas DataFrame
The str.contains method returns a boolean Series that is True where the original value contains the substring and False where it does not. A basic call looks like Series.str.contains("substring").
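To make the boolean result concrete, here is a small sketch using the sport values from the question. The variable names `sport` and `mask` are only illustrative.

```python
import pandas as pd

sport = pd.Series(['tennis', 'football', 'basketball'])

# One boolean per row: True where the value contains 'ball'.
mask = sport.str.contains('ball')

print(mask.tolist())  # [False, True, True]
```

By default the argument is treated as a regular expression; for a literal substring that contains special characters, passing regex=False is probably the safer choice.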
Replacing multiple values in a pandas column: with the DataFrame.replace() method you can replace several values with new strings in a single DataFrame column.
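A short sketch of what that could look like for the question's data, assuming you are willing to list every value explicitly (which is exactly what the question is trying to avoid, but it shows the exact-match behaviour):

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})

# Exact-match replacement: each key must equal the full cell value.
result = df.replace({'sport': {'football': 'ball sport',
                               'basketball': 'ball sport'}})

print(result)
```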
You can use str.contains to mask the rows that contain 'ball' and then overwrite them with the new value:
In [71]: df.loc[df['sport'].str.contains('ball'), 'sport'] = 'ball sport'
    ...: df
Out[71]:
    name       sport
0    Bob      tennis
1   Jane  ball sport
2  Alice  ball sport
To make it case-insensitive, pass case=False:
df.loc[df['sport'].str.contains('ball', case=False), 'sport'] = 'ball sport'
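To see what case=False changes, here is a sketch with mixed-case values. The capitalised data is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical mixed-case data to show the effect of case=False.
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'FOOTBALL', 'BasketBall']})

# Case-sensitive (default): lowercase 'ball' does not occur in these values.
strict = df['sport'].str.contains('ball')

# Case-insensitive: 'BALL' and 'Ball' both count as matches.
loose = df['sport'].str.contains('ball', case=False)

df.loc[loose, 'sport'] = 'ball sport'
print(df)
```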
You can use apply with a lambda. The x parameter of the lambda function will be each value in the 'sport' column:
df.sport = df.sport.apply(lambda x: 'ball sport' if 'ball' in x else x)
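One caveat with the lambda: `'ball' in x` raises a TypeError if x is not a string, for example a missing value. A hedged variant that guards against this is sketched below; the extra row with a missing sport is hypothetical and only there to show the guard.

```python
import pandas as pd

# Hypothetical data: one row has no sport recorded.
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Carol', 'Alice'],
                   'sport': ['tennis', 'football', None, 'basketball']})

# Only test for the substring when the value is actually a string;
# anything else (such as a missing value) is passed through untouched.
df['sport'] = df['sport'].apply(
    lambda x: 'ball sport' if isinstance(x, str) and 'ball' in x else x
)

print(df)
```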