
How to Pretty Print HTML to a file, with indentation

I am using lxml.html to generate some HTML. I want to pretty print (with indentation) my final result into an html file. How do I do that?

This is what I have tried so far:

    import lxml.html as lh
    from lxml.html import builder as E

    sliderRoot = lh.Element("div", E.CLASS("scroll"), style="overflow-x: hidden; overflow-y: hidden;")
    scrollContainer = lh.Element("div", E.CLASS("scrollContainer"), style="width: 4340px;")
    sliderRoot.append(scrollContainer)
    print lh.tostring(sliderRoot, pretty_print=True, method="html")

As you can see, I am passing the pretty_print=True argument. I thought that would give indented output, but it doesn't help. This is the output:

<div style="overflow-x: hidden; overflow-y: hidden;" class="scroll"><div style="width: 4340px;" class="scrollContainer"></div></div>

asked May 27 '11 at 09:05 by bcosynot



1 Answer

I ended up using BeautifulSoup directly. It is the library that lxml.html.soupparser uses for parsing HTML.

BeautifulSoup has a prettify method that does exactly what its name says: it returns the HTML with proper indentation.

BeautifulSoup will NOT fix the HTML, so broken code remains broken. But in this case, since the code is generated by lxml, the HTML should at least be semantically correct.

For the example given in my question, I have to do this:

    from BeautifulSoup import BeautifulSoup as bs

    root = lh.tostring(sliderRoot)  # convert the generated HTML to a string
    soup = bs(root)                 # make BeautifulSoup
    prettyHTML = soup.prettify()    # prettify the html
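
Since the question asks for the result in a file, here is a minimal sketch of the same idea with the write step included. It rests on a few assumptions that are not in the original post: it uses the newer `bs4` package (installed as `beautifulsoup4`) rather than the old `BeautifulSoup` module, it picks the standard-library `html.parser` backend, and the function name `write_pretty_html` and the output path are made up for illustration. A literal string stands in for `lh.tostring(sliderRoot)` so the snippet runs without lxml.

```python
import os
import tempfile

from bs4 import BeautifulSoup  # assumption: bs4 instead of the old BeautifulSoup module

# Stand-in for lh.tostring(sliderRoot); this is the unindented output from the question.
SAMPLE_HTML = (
    '<div style="overflow-x: hidden; overflow-y: hidden;" class="scroll">'
    '<div style="width: 4340px;" class="scrollContainer"></div></div>'
)


def write_pretty_html(html, path):
    """Indent `html` with BeautifulSoup and write the result to `path`.

    Hypothetical helper, not part of lxml or BeautifulSoup.
    Returns the prettified string so the caller can inspect it.
    """
    soup = BeautifulSoup(html, "html.parser")  # parse the generated markup
    pretty = soup.prettify()                   # one tag per line, indented by depth
    with open(path, "w", encoding="utf-8") as f:
        f.write(pretty)
    return pretty


# Illustrative output location; any writable path works.
out_path = os.path.join(tempfile.gettempdir(), "slider.html")
pretty = write_pretty_html(SAMPLE_HTML, out_path)
print(pretty)
```

On Python 3, `lh.tostring(sliderRoot)` returns bytes by default. BeautifulSoup accepts bytes, or you can pass `encoding="unicode"` to `tostring` to get a `str` directly.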
answered Sep 20 '22 at 17:09 by bcosynot