
How to get HTML from a Beautiful Soup object

I have the following bs4 object listing:

>>> listing
<div class="listingHeader"> <h2> ....
>>> type(listing)
<class 'bs4.element.Tag'>

I want to extract the raw html as a string. I've tried:

>>> a = listing.contents
>>> type(a)
<type 'list'>

This gives me a list rather than a string, so it does not work. How can I do this?

user1592380 asked Sep 08 '14 17:09



1 Answer

Just get the string representation:

html_content = str(listing) 

This is a non-prettified version.

If you want a prettified one, use the prettify() method:

html_content = listing.prettify() 
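
For comparison, here is a minimal sketch that puts both calls side by side. The markup and the `find()` lookup are made up for illustration, since the question does not show the full page. `decode_contents()` is not part of the answer above; it is included on the assumption that you may sometimes want only the inner HTML, without the outer tag.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page from the question.
html = '<div class="listingHeader"><h2>Title</h2></div>'
soup = BeautifulSoup(html, "html.parser")
listing = soup.find("div", class_="listingHeader")

# The tag and everything inside it, as a single unindented string.
raw_html = str(listing)

# The same markup, reflowed with one tag per line and indentation.
pretty_html = listing.prettify()

# Only the children of the tag, without the enclosing <div>.
inner_html = listing.decode_contents()

print(raw_html)
print(pretty_html)
print(inner_html)
```

Note that `listing.contents` from the question is a list of child nodes, which is why it comes back as a list rather than as text.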
alecxe answered Oct 22 '22 01:10