How to indent HTML with xmllint?

Tags: html, xmllint

I'm generating HTML that's all crushed together, and I'd like to give it proper indentation. I've been trying to use xmllint for this, but with no luck. For example, when file.html contains:

<table><tr><td><b>Foo</b></td></tr></table>
<table><tr><td>Bar</td></tr></table>

I get:

$ xmllint --format file.html
file.html:2: parser error : Extra content at the end of the document
<table><tr><td>Bar</td></tr></table>
^
<<< exit status [1] >>>

But when file.html contains either of those lines alone, it works fine. For example, with the second line removed:

$ xmllint --format file.html
<?xml version="1.0"?>
<table>
  <tr>
    <td>
      <b>Foo</b>
    </td>
  </tr>
</table>

When I include the --html option, it's more likely to run without errors, but then it doesn't indent.

Any suggestions? Are there any other (*nix) tools I can use for this? Thanks ...

Mori asked Oct 29 '25 14:10

1 Answer

As user 4M01 suggested: on the command line, pipe the output of xmllint into HTML Tidy.

Tidy repairs the HTML output from xmllint and wraps some reasonable ... around your HTML fragment.

xmllint --xpath "//tr[6]/td[7]" --html - | tidy -q
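
If indentation is the only goal, tidy can probably be run on the file directly. The following is a minimal sketch, assuming HTML Tidy is installed as `tidy`; the output name indented.html is made up for illustration. It recreates the question's file.html and asks tidy to indent it and print only the body content, without the wrapper it would otherwise add.

```shell
# Recreate the question's file.html: two sibling tables, no common root.
cat > file.html <<'EOF'
<table><tr><td><b>Foo</b></td></tr></table>
<table><tr><td>Bar</td></tr></table>
EOF

if command -v tidy >/dev/null 2>&1; then
  # -q                    suppress non-essential output
  # -i                    indent element content
  # --show-body-only yes  print only the body content, not the added wrapper
  # --show-warnings no    hide warnings such as the missing doctype
  # tidy exits with 1 when it only raised warnings, so accept that status.
  tidy -q -i --show-body-only yes --show-warnings no file.html > indented.html \
    || [ $? -eq 1 ]
  cat indented.html   # indented.html is a hypothetical output name
else
  echo "tidy not found; install HTML Tidy to try this" >&2
fi
```

An exit status of 2 from tidy means it hit real errors, in which case the output should not be trusted.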
knb answered Oct 31 '25 04:10
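
On the error in the question: an XML document must have exactly one root element, so in XML mode xmllint treats the second table as "extra content". A possible xmllint-only workaround, sketched here under the assumption that the fragments are themselves well-formed XML (every tag closed), is to wrap them in a temporary root before formatting. The `<root>` element and the wrapped.xml file name are made up for illustration.

```shell
# Recreate the question's file.html: two sibling tables, no common root.
cat > file.html <<'EOF'
<table><tr><td><b>Foo</b></td></tr></table>
<table><tr><td>Bar</td></tr></table>
EOF

if command -v xmllint >/dev/null 2>&1; then
  # Wrap the fragments in a hypothetical <root> element so the input has a
  # single root, then let xmllint reindent it. "-" makes xmllint read stdin.
  { echo '<root>'; cat file.html; echo '</root>'; } \
    | xmllint --format - > wrapped.xml
  cat wrapped.xml
else
  echo "xmllint not found; it ships with libxml2" >&2
fi
```

The result keeps the extra `<root>` wrapper and the XML declaration, which would need to be stripped afterwards. Real-world HTML with unclosed tags such as `<br>` will still fail in XML mode, which is where tidy is the more forgiving choice.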