Is it possible to use HTML Tidy to just indent HTML code?
Sample Code
    <form action="?" method="get" accept-charset="utf-8">
    <ul>
    <li>
    <label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" />
    </li>
    <li><input class="submit" type="submit" value="Search" /></li>
    </ul>
    </form>
Desired Result
    <form action="?" method="get" accept-charset="utf-8">
        <ul>
            <li>
                <label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q"/>
            </li>
            <li><input class="submit" type="submit" value="Search"/></li>
        </ul>
    </form>
If I run it with the standard command,

    tidy -f errs.txt -m index.html

then I get this:
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
    <html>
    <head>
    <meta name="generator" content=
    "HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 15.3.6), see www.w3.org">
    <title></title>
    </head>
    <body>
    <form action="?" method="get" accept-charset="utf-8">
    <ul>
    <li><label class="screenReader" for=
    "q">Keywords</label><input type="text" name="q" value="" id=
    "q"></li>
    <li><input class="submit" type="submit" value="Search"></li>
    </ul>
    </form>
    </body>
    </html>
How can I omit all the extra stuff and actually get it to indent the code?
Forgive me if that's not a feature it's supposed to support. If it isn't, what library or tool am I looking for?
Many tools generate HTML with an excess of FONT, NOBR and CENTER tags. Tidy's -clean option will replace them by style properties and rules using CSS. This makes the markup easier to read and maintain as well as reducing the file size!
HTML code does not need to be indented, and all browsers and search engines ignore indentation and extra spacing. However, for any human reader, it's a good idea to indent your text because it makes the code easier to scan and read.
Tidy is a powerful program whose main purpose is to fix errors in HTML documents. TidyLib is the library version of Tidy, written in C. Because C libraries are easy to link against, it can be used from nearly any programming language, including PHP.
Use the indent, tidy-mark, and quiet options:

    tidy \
      -indent \
      --indent-spaces 2 \
      -quiet \
      --tidy-mark no \
      index.html
Or, using a config file rather than command-line options:
    indent: auto
    indent-spaces: 2
    quiet: yes
    tidy-mark: no
Name it tidy_config.txt and save it in the same directory as the .html file. Run it like this:

    tidy -config tidy_config.txt index.html
For more customization, use the tidy man page to find other relevant options such as markup: no or force-output: yes.
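To also drop the DOCTYPE, html, head, and body wrapper shown in the question, the config above can be combined with show-body-only, which also appears in the longer config in the other answer. The following is only a sketch under some assumptions: the file names fragment.html, tidy_body_only.txt, errs_body_only.txt, and fragment.indented.html are made up for illustration, and the exact output depends on the Tidy version installed.

```shell
#!/bin/sh
# Hypothetical input file, used only for illustration.
cat > fragment.html <<'EOF'
<form action="?" method="get" accept-charset="utf-8"> <ul> <li><input class="submit" type="submit" value="Search" /></li> </ul> </form>
EOF

# Same options as the config above, plus show-body-only so that only the
# contents of <body> are written out.
cat > tidy_body_only.txt <<'EOF'
indent: auto
indent-spaces: 2
quiet: yes
tidy-mark: no
show-body-only: yes
EOF

# Run Tidy only if it is installed. Tidy exits with status 1 when it reports
# warnings and 2 when it reports errors, so a non-zero status is not treated
# as fatal here. Diagnostics go to errs_body_only.txt via -f.
if command -v tidy >/dev/null 2>&1; then
    tidy -config tidy_body_only.txt -f errs_body_only.txt fragment.html \
        > fragment.indented.html || true
    cat fragment.indented.html
fi
```

Writing to a separate output file instead of using -m keeps the original untouched, which makes it easy to compare the two and see what Tidy changed besides the indentation.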
I couldn't find a way to make Tidy only re-indent without making any other changes. The following config file repairs as little as possible and (mostly) only re-indents the HTML. Tidy still corrects some error conditions, such as duplicated (repeated) attributes.
    #based on http://tidy.sourceforge.net/docs/quickref.html
    #HTML, XHTML, XML Options Reference
    anchor-as-name: no  #?
    doctype: omit
    drop-empty-paras: no
    fix-backslash: no
    fix-bad-comments: no
    fix-uri: no
    hide-endtags: yes   #?
    #input-xml: yes     #?
    join-styles: no
    literal-attributes: yes
    lower-literals: no
    merge-divs: no
    merge-spans: no
    output-html: yes
    preserve-entities: yes
    quote-ampersand: no
    quote-nbsp: no
    show-body-only: auto

    #Diagnostics Options Reference
    show-errors: 0
    show-warnings: 0

    #Pretty Print Options Reference
    break-before-br: yes
    indent: yes
    indent-attributes: no   #default
    indent-spaces: 4
    tab-size: 4
    wrap: 132
    wrap-asp: no
    wrap-jste: no
    wrap-php: no
    wrap-sections: no

    #Character Encoding Options Reference
    char-encoding: utf8

    #Miscellaneous Options Reference
    force-output: yes
    quiet: yes
    tidy-mark: no
For example, the following HTML fragment

    <div>
    <div>
    <p>
    not closed para
    <h1>
    h1 head
    </h1>
    <ul>
    <li>not closed li
    <li>closed li</li>
    </ul>
    some text
    </div>
    </div>

will be changed to

    <div>
        <div>
            <p>
                not closed para
            <h1>
                h1 head
            </h1>
            <ul>
                <li>not closed li
                <li>closed li
            </ul>some text
        </div>
    </div>
As you can see, hide-endtags: yes drops the closing </li> from the second bullet in the input. Setting hide-endtags: no gives the following instead:

    <div>
        <div>
            <p>
                not closed para
            </p>
            <h1>
                h1 head
            </h1>
            <ul>
                <li>not closed li
                </li>
                <li>closed li
                </li>
            </ul>some text
        </div>
    </div>

So tidy adds a closing </p> to the paragraph and a closing </li> to the first bullet.
I couldn't find a way to preserve everything in the input and only re-indent the file.
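To see the effect of hide-endtags on your own Tidy build, something like the following sketch can help. It assumes the sample fragment from this answer saved under the made-up name sample.html, and it passes the relevant settings on the command line rather than through the full config file. Some Tidy releases may name or treat this option differently, so check the quick reference for your version if Tidy rejects it.

```shell
#!/bin/sh
# The sample fragment from above, saved under a hypothetical file name.
cat > sample.html <<'EOF'
<div>
<div>
<p>
not closed para
<h1>
h1 head
</h1>
<ul>
<li>not closed li
<li>closed li</li>
</ul>
some text
</div>
</div>
EOF

# Run Tidy once per hide-endtags setting and compare the results.
# Non-zero exit codes (1 = warnings, 2 = errors) are ignored on purpose,
# and diagnostics are discarded to keep the comparison readable.
if command -v tidy >/dev/null 2>&1; then
    for setting in yes no; do
        tidy -indent --indent-spaces 4 -quiet --tidy-mark no \
             --show-body-only yes --hide-endtags "$setting" \
             sample.html > "out_$setting.html" 2>/dev/null || true
    done
    # diff exits with status 1 when the files differ, which is expected here.
    diff out_yes.html out_no.html || true
fi
```

The diff should consist mostly of the optional end tags that Tidy inserts or omits, which makes it easier to judge which setting stays closer to the original markup.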