BeautifulSoup: find text with and without regex

The html:

<td>some key
</td>

find without regex:

soup.find(text='some key')

returned None

find with regex

soup.find(text=re.compile('some key'))

returned the td node.

Could anyone explain the difference between the two approaches? "some key" is a literal string with no special characters. I did notice that there is a line break at the end of "some key", so </td> appears on the next line.

Thank you.

Candy Chiu asked Feb 02 '23 21:02


1 Answer

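When the search value is a plain string, Beautiful Soup compares it to the whole text between the tags with ==, so the match has to be exact. The text inside the td includes the trailing line break, and since 'some key\r\n' != 'some key', the search returned None. A compiled regex is applied with its search() method, which matches anywhere in the string, so the trailing line break does not get in the way.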
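A minimal sketch of the behaviour, assuming bs4 with the built-in html.parser. The inline markup (with a "\n" line break), the variable names, and the helper is_some_key are made up for illustration; string= is the current name for the text= argument used in the question.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the text node ends with a line break, as in the question.
html = "<table><tr><td>some key\n</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# Plain string: compared against the whole text node, so the trailing "\n" breaks the match.
exact = soup.find(string="some key")

# Compiled regex: applied with search(), so it matches anywhere in the text node.
by_regex = soup.find(string=re.compile("some key"))

# Hypothetical helper: exact match that ignores surrounding whitespace.
def is_some_key(s):
    return s is not None and s.strip() == "some key"

stripped = soup.find(string=is_some_key)

print(repr(exact))
print(repr(by_regex))
print(by_regex.parent.name)
print(repr(stripped))
```

In this sketch both successful searches return the text node itself rather than the tag; the enclosing td is reachable through .parent. Passing a function is one way to keep an exact comparison while tolerating the line break, if a regex feels too loose.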

Candy Chiu answered Mar 06 '23 12:03