With the code below:
soup = BeautifulSoup(page.read(), fromEncoding="utf-8")
result = soup.find('div', {'class' :'flagPageTitle'})
I get the following HTML:
<div id="ctl00_ContentPlaceHolder1_Item65404" class="flagPageTitle" style=" "> <span></span><p>Some text here</p> </div>
How can I get "Some text here" without any tags? Is there an InnerText equivalent in BeautifulSoup?
A NavigableString object represents a piece of text inside a tag. To access it, use the .string attribute of the tag. You can't edit a string in place, but you can replace it with another string using replace_with().
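As a rough illustration of the point above, here is a minimal sketch using bs4 with the built-in "html.parser". The sample markup and the variable names are made up for the example; they are not from the question.

```python
from bs4 import BeautifulSoup, NavigableString

# Hypothetical sample markup, parsed with the standard-library parser.
soup = BeautifulSoup("<p>Some text here</p>", "html.parser")
p = soup.p

# .string returns the tag's single text child as a NavigableString.
original = p.string
is_navigable = isinstance(original, NavigableString)

# Strings cannot be edited in place; swap in a new one instead.
original.replace_with("Replacement text")
updated_markup = str(p)

print(is_navigable, updated_markup)
```

Note that .string only works when the tag has exactly one child; for a tag with several children, such as the div in the question, it returns None.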
To get href values with Python BeautifulSoup, we can use the find_all method. First, create a soup object by calling the BeautifulSoup class with the HTML string. Then find the a elements that have an href attribute by calling find_all with 'a' and href set to True.
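The steps above might look like this in practice. This is only a sketch: the HTML snippet and the example.com URL are placeholders, and "html.parser" is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Placeholder markup: two links with href, one anchor without.
html = (
    '<a href="https://example.com/a">A</a>'
    '<a name="anchor">no href</a>'
    '<a href="/b">B</a>'
)
soup = BeautifulSoup(html, "html.parser")

# href=True keeps only the <a> tags that actually carry an href attribute.
links = [a["href"] for a in soup.find_all("a", href=True)]
print(links)
```

The anchor without an href is skipped, so indexing with a["href"] does not raise a KeyError.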
The prettify() method will turn a Beautiful Soup parse tree into a nicely formatted Unicode string, with a separate line for each tag and each string.
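A quick way to see what prettify() does is to run it on a small, made-up snippet (again assuming bs4 and "html.parser"):

```python
from bs4 import BeautifulSoup

# Made-up markup, nested one level deep.
soup = BeautifulSoup("<div><p>Some text here</p></div>", "html.parser")

# prettify() returns a str with each tag and each string on its own line.
pretty = soup.prettify()
print(pretty)
```

This is mainly useful for inspecting markup by eye. It adds whitespace, so it is not a good way to extract text.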
The NavigableString class is provided by Beautiful Soup, a Python library for parsing HTML and XML that is widely used for web scraping. Web scraping is the process of extracting data from websites with automated tools. A NavigableString corresponds to a bit of text within a tag.
All you need is:
result = soup.find('div', {'class' :'flagPageTitle'}).text
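For reference, here is a self-contained sketch of the same idea against the HTML from the question. It assumes the newer bs4 package and the "html.parser" parser rather than the original page.read() input. In bs4, .text is a shortcut for get_text(), and get_text(strip=True) also drops the surrounding whitespace.

```python
from bs4 import BeautifulSoup

# The div from the question, inlined so the example runs on its own.
html = (
    '<div id="ctl00_ContentPlaceHolder1_Item65404" class="flagPageTitle" style=" ">'
    ' <span></span><p>Some text here</p> </div>'
)
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", {"class": "flagPageTitle"})

raw_text = div.text                    # all nested text, whitespace included
clean_text = div.get_text(strip=True)  # same text with each piece stripped

print(repr(raw_text))
print(clean_text)
```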