I have the following bs4 object listing:
>>> listing
<div class="listingHeader">
<h2>
....
>>> type(listing)
<class 'bs4.element.Tag'>
I want to extract the raw html as a string. I've tried:
>>> a = listing.contents
>>> type(a)
<type 'list'>
So this does not work. How can I do this?
With a parser you are comfortable with, it's fairly easy to crawl through web pages using BeautifulSoup. To get all the HTML tags of a web page, first import the BeautifulSoup and requests libraries, then make a GET request to the web page. Step-by-step approach: import the required modules.
Use the a tag to extract the links from the BeautifulSoup object. Get the actual URLs from all the anchor tag objects by calling the get() method with the href argument. You can also get the title of each link by calling get() with the title argument.
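A minimal sketch of that link extraction, assuming the page has already been fetched. The inline HTML string and the example.com URLs are made up for illustration, and the built-in html.parser is used so no extra parser is needed:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page.
html = """
<div>
  <a href="https://example.com/a" title="First">A</a>
  <a href="https://example.com/b">B</a>
  <a name="no-href">C</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True keeps only anchors that actually carry an href attribute.
# get() returns None when an attribute (here: title) is missing.
links = [(a.get("href"), a.get("title")) for a in soup.find_all("a", href=True)]

for href, title in links:
    print(href, title)
```

Using get() rather than a["title"] avoids a KeyError on anchors that have no title attribute.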
Just get the string representation:
html_content = str(listing)
This is a non-prettified version.
If you want a prettified one, use the prettify() method:
html_content = listing.prettify()
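Here is a small self-contained sketch that puts both options side by side. The markup is a made-up stand-in for the elided listing in the question, and decode_contents() is an extra I am adding in case you only want the inner HTML without the enclosing div:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real listing in the question is elided.
html = '<div class="listingHeader"><h2>Title</h2></div>'
soup = BeautifulSoup(html, "html.parser")
listing = soup.find("div", class_="listingHeader")

# Outer HTML: the tag itself plus everything inside it.
outer_html = str(listing)

# Inner HTML only: the children rendered as one string, without the <div>.
inner_html = listing.decode_contents()

# Indented version, one tag or text node per line.
pretty_html = listing.prettify()

print(outer_html)
print(inner_html)
print(pretty_html)
```

listing.contents returns a list because it holds the tag's child nodes, not their serialized form; str() or decode_contents() does the serialization.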