I am using lxml.html to generate some HTML. I want to pretty print (with indentation) my final result into an html file. How do I do that?
This is what I have tried so far:
    import lxml.html as lh
    from lxml.html import builder as E

    sliderRoot = lh.Element("div", E.CLASS("scroll"), style="overflow-x: hidden; overflow-y: hidden;")
    scrollContainer = lh.Element("div", E.CLASS("scrollContainer"), style="width: 4340px;")
    sliderRoot.append(scrollContainer)
    print lh.tostring(sliderRoot, pretty_print=True, method="html")
As you can see, I am passing the pretty_print=True argument. I thought that would give indented code, but it doesn't really help. This is the output:
<div style="overflow-x: hidden; overflow-y: hidden;" class="scroll"><div style="width: 4340px;" class="scrollContainer"></div></div>
To display the contents of an HTML file from Python, the codecs library can be used. It opens files that have a specific encoding: codecs.open() takes an encoding parameter, which the built-in open() function in Python 2 does not accept.
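As a rough sketch of the codecs approach, the snippet below writes an already prettified HTML string to a file with an explicit encoding and reads it back. The file name slider.html, the temporary directory, and the sample string are assumptions made up for illustration; they do not come from the original post.

```python
import codecs
import os
import tempfile

# Hypothetical sample: an already indented HTML fragment.
pretty_html = (
    u'<div class="scroll">\n'
    u' <div class="scrollContainer">\n'
    u' </div>\n'
    u'</div>\n'
)

# Hypothetical output location inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "slider.html")

# codecs.open() encodes on write and decodes on read using the given encoding.
with codecs.open(out_path, "w", encoding="utf-8") as fh:
    fh.write(pretty_html)

with codecs.open(out_path, "r", encoding="utf-8") as fh:
    round_trip = fh.read()

print(round_trip)
```

On Python 3 the built-in open() also accepts an encoding argument, so codecs is mainly useful when the same code has to run on Python 2.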
I ended up using BeautifulSoup directly. That is something lxml.html.soupparser uses for parsing HTML.
BeautifulSoup has a prettify method that does exactly what it says it does. It prettifies the HTML with proper indents and everything.
BeautifulSoup will NOT fix the HTML, so broken code remains broken. But in this case, since the code is generated by lxml, the HTML should be at least semantically correct.
For the example given in my question, I have to do this:
    from BeautifulSoup import BeautifulSoup as bs

    root = lh.tostring(sliderRoot)  # convert the generated HTML to a string
    soup = bs(root)                 # make BeautifulSoup
    prettyHTML = soup.prettify()    # prettify the html
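The import above is the BeautifulSoup 3 style used with Python 2. A minimal sketch of the same idea on Python 3, assuming the beautifulsoup4 package (imported as bs4) is installed and using the built-in "html.parser" backend, might look like this. To keep it self-contained, the HTML string is copied from the output in the question instead of being produced by lh.tostring().

```python
from bs4 import BeautifulSoup

# HTML copied from the question's output; in the real script this would be
# the (decoded) result of lh.tostring(sliderRoot).
html = (
    '<div style="overflow-x: hidden; overflow-y: hidden;" class="scroll">'
    '<div style="width: 4340px;" class="scrollContainer"></div></div>'
)

# Assumption: "html.parser" is chosen so that no extra parser is required.
soup = BeautifulSoup(html, "html.parser")

# prettify() returns a string with one tag per line, indented by nesting depth.
pretty_html = soup.prettify()
print(pretty_html)
```

Note that prettify() inserts whitespace between tags, which can change how inline elements render in a browser, so it is best suited to output meant for human reading.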