Python - Find text using BeautifulSoup, then replace it in the original soup variable

commentary = soup.find('div', {'id' : 'live-text-commentary-wrapper'})
findtoure = commentary.find(text = re.compile('Gnegneri Toure Yaya')).replace('Gnegneri Toure Yaya', 'Yaya Toure')

Commentary contains various instances of Gnegneri Toure Yaya that need changing to Yaya Toure.

Using findAll() instead doesn't work, because findtoure is then a list and a list has no .replace() method.

The other problem is that this code only puts the replaced text into a new variable called findtoure; I need the replacement to happen in the original soup.

I think I am just looking at this from the wrong perspective.

user2073606 asked Feb 24 '13 21:02

1 Answer

You cannot do what you want with just .replace(). From the BeautifulSoup documentation on NavigableString:

You can’t edit a string in place, but you can replace one string with another, using replace_with().

That's exactly what you need to do: take each match, call .replace() on its text, and then replace the original string in the tree with the result:

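# find_all() returns every matching NavigableString under commentary
findtoure = commentary.find_all(text=re.compile('Gnegneri Toure Yaya'))
for comment in findtoure:
    # .replace() returns a new str; replace_with() swaps it into the tree
    fixed_text = comment.replace('Gnegneri Toure Yaya', 'Yaya Toure')
    comment.replace_with(fixed_text)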
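
For reference, here is a minimal self-contained sketch of the loop above, run against a small made-up HTML fragment. The markup and the sample sentences are assumptions for illustration; only the div id comes from the question. It uses the string= keyword, which Beautiful Soup 4.4.0 and later accept in place of text=, and escapes the name with re.escape() as a precaution.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<div id="live-text-commentary-wrapper">
  <p>Goal! Gnegneri Toure Yaya scores from distance.</p>
  <p>Foul by <b>Gnegneri Toure Yaya</b> in midfield.</p>
  <p>Corner taken short.</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
commentary = soup.find("div", {"id": "live-text-commentary-wrapper"})

old_name = "Gnegneri Toure Yaya"
new_name = "Yaya Toure"

# find_all() returns a list of the NavigableString nodes matching the pattern.
matches = commentary.find_all(string=re.compile(re.escape(old_name)))
for node in matches:
    # str.replace() builds a new plain string; replace_with() swaps it
    # into the tree in place of the original node.
    node.replace_with(node.replace(old_name, new_name))

# The change is visible through the original soup object.
print(old_name in str(soup))
print(new_name in soup.get_text())
```

Because find_all() has already collected the matches into a list before the loop starts, replacing nodes while iterating does not disturb the iteration.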

If you want to use these comments further, you'll need to run a new search, because the old NavigableString objects are no longer part of the tree:

findtoure = commentary.find_all(text=re.compile('Yaya Toure'))

or, if all you need is the resulting strings (that is, Python str objects, not NavigableString objects still connected to the BeautifulSoup object), just collect the fixed_text objects:

findtoure = commentary.find_all(text=re.compile('Gnegneri Toure Yaya'))
fixed_comments = []
for comment in findtoure:
    fixed_text = comment.replace('Gnegneri Toure Yaya', 'Yaya Toure')
    comment.replace_with(fixed_text)
    fixed_comments.append(fixed_text)
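
To illustrate the distinction between the two kinds of result, the sketch below wraps the loop in a helper and compares the collected values with the result of a fresh search. The helper name replace_in_strings and the sample markup are hypothetical and not part of Beautiful Soup; string= is the newer spelling of the text= argument.

```python
import re

from bs4 import BeautifulSoup
from bs4.element import NavigableString


def replace_in_strings(root, old, new):
    """Replace `old` with `new` in every text node under `root`.

    Hypothetical helper: returns the replacement texts as plain str objects.
    """
    fixed = []
    for node in root.find_all(string=re.compile(re.escape(old))):
        fixed_text = node.replace(old, new)  # plain str, not attached to the tree
        node.replace_with(fixed_text)        # the tree now holds the new text
        fixed.append(fixed_text)
    return fixed


# Made-up sample document.
soup = BeautifulSoup(
    "<div><p>Gnegneri Toure Yaya shoots.</p><p>Save!</p></div>", "html.parser"
)
fixed_comments = replace_in_strings(soup, "Gnegneri Toure Yaya", "Yaya Toure")

# The collected values are ordinary strings with no link to the document.
print(fixed_comments)
print(type(fixed_comments[0]) is str)

# A new search returns NavigableString nodes that know their parent tag.
attached = soup.find_all(string=re.compile("Yaya Toure"))
print(isinstance(attached[0], NavigableString), attached[0].parent.name)
```

The plain strings are enough for printing or storing the text; the re-found nodes are the ones to use if the tree needs further edits.
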
Martijn Pieters answered Sep 29 '22 04:09
