How to get HTML from a Beautiful Soup object

Tags: python, html, html-parsing, beautifulsoup

I have the following bs4 object listing:

>>> listing
<div class="listingHeader"> <h2> ....
>>> type(listing)
<class 'bs4.element.Tag'>

I want to extract the raw html as a string. I've tried:

>>> a = listing.contents
>>> type(a)
<type 'list'>

So this does not work. How can I do this?

685

asked Sep 08 '14 17:09

user1592380

1 Answer

Just get the string representation:

html_content = str(listing)

This is a non-prettified version.

If you want a prettified one, use the prettify() method:

html_content = listing.prettify()
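
As a minimal sketch of how these options compare, the snippet below builds a small tag and converts it in a few ways. The markup and the variable names (html, outer_html, pretty_html, inner_html, joined) are made up for illustration and are not from the question; it assumes bs4 is installed and uses the built-in "html.parser". It also shows decode_contents(), which returns only the inner HTML, and why .contents alone is not a string: it is a list of child nodes.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page in the question.
html = '<div class="listingHeader"><h2>Title</h2><p>Text</p></div>'
soup = BeautifulSoup(html, "html.parser")
listing = soup.find("div", class_="listingHeader")

# The tag itself plus everything inside it, on one line.
outer_html = str(listing)

# The same markup, re-indented across multiple lines.
pretty_html = listing.prettify()

# Only the children, without the surrounding <div>.
inner_html = listing.decode_contents()

# .contents is a list of child nodes, so each one has to be
# converted and joined to get a single string.
joined = "".join(str(child) for child in listing.contents)

print(outer_html)
print(inner_html)
```

Use str(listing) when you need the element's own tag included, and decode_contents() when you only want what is inside it.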

123

answered Oct 22 '22 01:10

alecxe
