I have two data frames. The first, df:
id string_data
1 My name is Jeff
2 Hello, I am John
3 I like Brad he is cool.
Another data frame, allnames, contains a list of names like this:
id name
1 Jeff
2 Brad
3 John
4 Emily
5 Ross
I want to replace all the words in df that appear in allnames['name'] with "Firstname".
Expected output:
id string_data
1 My name is Firstname
2 Hello, I am Firstname
3 I like Firstname he is cool.
I tried this:
nameList = '|'.join(allnames['name'])
df['string_data'].str.replace(nameList, "FirstName", case = False)
But it replaces almost 99% of the words.
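One plausible cause, shown here as a sketch rather than a diagnosis of the actual data: an unanchored alternation matches names inside longer words. The short names below ("Al", "Ed", "An") are hypothetical and are not in the sample allnames above; they only illustrate how a real name list could hit most words once case is ignored.

```python
import re

# Hypothetical short names, assumed for illustration only
short_names = ["Al", "Ed", "An"]
text = "I called and asked"

# Unanchored alternation: matches anywhere, including inside other words
unbounded = re.sub("|".join(short_names), "FirstName", text, flags=re.IGNORECASE)

# Same names wrapped in word boundaries: only whole words can match
bounded_pattern = r"\b(?:" + "|".join(re.escape(n) for n in short_names) + r")\b"
bounded = re.sub(bounded_pattern, "FirstName", text, flags=re.IGNORECASE)

print(unbounded)  # words such as "called" and "asked" are mangled
print(bounded)    # text is left unchanged, since no whole word is a name
```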
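Your solution should work if you add word boundaries (\b) to the pattern passed to Series.str.replace:
import re
# escape each name and wrap it in word boundaries so only whole words match
nameList = '|'.join(r"\b{}\b".format(re.escape(x)) for x in allnames['name'])
# regex=True is needed in pandas >= 2.0, where str.replace treats the pattern literally by default
df['string_data'] = df['string_data'].str.replace(nameList, "FirstName", case = False, regex = True)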
print (df)
id string_data
0 1 My name is FirstName
1 2 Hello, I am FirstName
2 3 I like FirstName he is cool.
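For reference, the word-boundary approach can be put together as a self-contained sketch. The frames are rebuilt from the sample data in the question; the variable names pattern and result are my own, and re.escape is an extra precaution in case a name contains regex metacharacters.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3],
    "string_data": [
        "My name is Jeff",
        "Hello, I am John",
        "I like Brad he is cool.",
    ],
})
allnames = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "name": ["Jeff", "Brad", "John", "Emily", "Ross"],
})

# One alternation of escaped names, wrapped in a single pair of word boundaries
pattern = r"\b(?:" + "|".join(re.escape(n) for n in allnames["name"]) + r")\b"

# case=False ignores case; regex=True makes pandas treat the pattern as a regex
result = df["string_data"].str.replace(pattern, "Firstname", case=False, regex=True)
print(result.tolist())
```

Grouping the names as \b(?:...)\b is equivalent to wrapping every name in its own \b pair, but keeps the pattern shorter for long name lists.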
Or replace the values word by word, using a dictionary with get and join:
d = dict.fromkeys(allnames['name'], 'Firstname')
f = lambda x: ' '.join(d.get(y, y) for y in x.split())
df['string_data'] = df['string_data'].apply(f)
print (df)
id string_data
0 1 My name is Firstname
1 2 Hello, I am Firstname
2 3 I like Firstname he is cool.
EDIT: To match names regardless of case, convert both the names and the words to lowercase with lower:
d = dict.fromkeys([x.lower() for x in allnames['name']], 'Firstname')
f = lambda x: ' '.join(d.get(y.lower(), y) for y in x.split())
df['string_data'] = df['string_data'].apply(f)
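One caveat with splitting on whitespace: a token such as "Jeff." or "Brad," keeps its punctuation and so is not found in the dictionary. A possible variant, sketched here with a hypothetical helper mask_names and made-up sample strings, looks up each \w+ token instead and leaves punctuation in place.

```python
import re
import pandas as pd

# Lowercased names from the sample allnames frame
names = {"jeff", "brad", "john", "emily", "ross"}

def mask_names(text, replacement="Firstname"):
    # Replace a word token only if its lowercase form is a known name
    def swap(match):
        word = match.group(0)
        return replacement if word.lower() in names else word
    return re.sub(r"\w+", swap, text)

# Made-up strings with punctuation and mixed case, for illustration
s = pd.Series(["My name is Jeff.", "Hello, I am JOHN", "Brad, come across here"])
out = s.apply(mask_names)
print(out.tolist())
```

Because whole tokens are looked up in a set, "across" is left alone even though it contains "ross".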