 

Use HTML Tidy to just indent HTML code?

Tags:

html

htmltidy

Is it possible to use HTML Tidy to just indent HTML code?

Sample Code

<form action="?" method="get" accept-charset="utf-8">

<ul>
<li>
<label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" />
</li>
<li><input class="submit" type="submit" value="Search" /></li>
</ul>


</form>

Desired Result

<form action="?" method="get" accept-charset="utf-8">
    <ul>
        <li>
        <label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q"/>
        </li>
        <li><input class="submit" type="submit" value="Search"/></li>
    </ul>
</form>

If I run it with the standard command, tidy -f errs.txt -m index.html, I get this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
<head>
<meta name="generator" content=
"HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 15.3.6), see www.w3.org">
<title></title>
</head>
<body>
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li><label class="screenReader" for=
"q">Keywords</label><input type="text" name="q" value="" id=
"q"></li>
<li><input class="submit" type="submit" value="Search"></li>
</ul>
</form>
</body>
</html>

How can I omit all the extra stuff and actually get it to indent the code?

Forgive me if that's not a feature it's supposed to support. In that case, what library or tool am I looking for?

cwd asked Aug 22 '11 17:08



2 Answers

Use the indent, tidy-mark, and quiet options:

tidy \
  -indent \
  --indent-spaces 2 \
  -quiet \
  --tidy-mark no \
  index.html

Or, using a config file rather than command-line options:

indent: auto
indent-spaces: 2
quiet: yes
tidy-mark: no

Name it tidy_config.txt and save it in the same directory as the .html file. Run it like this:

tidy -config tidy_config.txt index.html 

For more customization, use the tidy man page to find other relevant options such as markup: no or force-output: yes.
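The question also asks how to drop the added DOCTYPE, <html>, <head> and <body> wrapper. The sketch below is one possible way to do that, not a definitive recipe: it adds the show-body-only option so that only the body content is printed, and output-xhtml so that the self-closing /> form is kept. The file names fragment.html and indented.html are made up for illustration, and whether every option behaves the same way depends on the installed Tidy version.

```shell
#!/bin/sh
# Sketch: indent an HTML fragment and print only the body content.
# fragment.html and indented.html are hypothetical file names.
cat > fragment.html <<'EOF'
<form action="?" method="get" accept-charset="utf-8">
<ul>
<li><label class="screenReader" for="q">Keywords</label><input type="text" name="q" value="" id="q" /></li>
<li><input class="submit" type="submit" value="Search" /></li>
</ul>
</form>
EOF

if command -v tidy >/dev/null 2>&1; then
  status=0
  # --show-body-only yes : omit the DOCTYPE, <html>, <head> and <body> wrapper
  # --wrap 0             : disable line wrapping so attributes stay on one line
  # --output-xhtml yes   : write XHTML, which keeps empty elements self-closed
  tidy -quiet -indent --indent-spaces 4 --wrap 0 \
       --show-body-only yes --tidy-mark no --output-xhtml yes \
       -output indented.html fragment.html || status=$?
  # Tidy exits with 1 when it only has warnings and 2 when it found errors.
  if [ "$status" -ge 2 ]; then
    echo "tidy reported errors (exit status $status)" >&2
  fi
else
  echo "tidy not found on PATH; nothing was indented" >&2
fi
```

Writing to a separate file with -output leaves the original untouched, which makes it easy to diff the two files and see what Tidy changed besides the indentation.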

Paul Sweatte answered Sep 23 '22 11:09


I didn't find a way to only re-indent without making any other changes. The following config file repairs as little as possible and (mostly) only re-indents the HTML. Tidy still corrects some error conditions, such as duplicated (repeated) attributes.

#based on http://tidy.sourceforge.net/docs/quickref.html
#HTML, XHTML, XML Options Reference
anchor-as-name: no  #?
doctype: omit
drop-empty-paras: no
fix-backslash: no
fix-bad-comments: no
fix-uri: no
hide-endtags: yes   #?
#input-xml: yes     #?
join-styles: no
literal-attributes: yes
lower-literals: no
merge-divs: no
merge-spans: no
output-html: yes
preserve-entities: yes
quote-ampersand: no
quote-nbsp: no
show-body-only: auto

#Diagnostics Options Reference
show-errors: 0
show-warnings: 0

#Pretty Print Options Reference
break-before-br: yes
indent: yes
indent-attributes: no   #default
indent-spaces: 4
tab-size: 4
wrap: 132
wrap-asp: no
wrap-jste: no
wrap-php: no
wrap-sections: no

#Character Encoding Options Reference
char-encoding: utf8

#Miscellaneous Options Reference
force-output: yes
quiet: yes
tidy-mark: no

For example, the following HTML fragment

<div>
<div>
<p>
not closed para
<h1>
h1 head
</h1>
<ul>
<li>not closed li
<li>closed li</li>
</ul>
some text
</div>
</div>

will be changed to

<div>
    <div>
        <p>
            not closed para
        <h1>
            h1 head
        </h1>
        <ul>
            <li>not closed li
            <li>closed li
            </ul>some text
    </div>
</div>

As you can see, hide-endtags: yes drops the closing </li> of the second bullet in the input. Setting hide-endtags: no gives the following:

<div>
    <div>
        <p>
            not closed para
        </p>
        <h1>
            h1 head
        </h1>
        <ul>
            <li>not closed li
            </li>
            <li>closed li
            </li>
        </ul>some text
    </div>
</div>

So Tidy adds the closing </p> and the closing </li> for the first bullet.

I didn't find a way to preserve everything in the input and only re-indent the file.
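To try the hide-endtags comparison from this answer yourself, something like the following sketch may help. It writes a cut-down subset of the config to a file and runs Tidy on the fragment twice, overriding hide-endtags on the command line each time. The file names reindent.conf and fragment.html are made up for illustration, show-body-only is set to yes here rather than auto as an assumption for older builds, and newer HTML Tidy releases may expect the option under the name omit-optional-tags, so treat the spelling as version-dependent.

```shell
#!/bin/sh
# Sketch: compare Tidy output with hide-endtags switched on and off.
# reindent.conf and fragment.html are hypothetical file names.
cat > reindent.conf <<'EOF'
doctype: omit
show-body-only: yes
indent: yes
indent-spaces: 4
wrap: 132
force-output: yes
quiet: yes
tidy-mark: no
show-warnings: no
EOF

cat > fragment.html <<'EOF'
<div>
<div>
<p>
not closed para
<h1>
h1 head
</h1>
<ul>
<li>not closed li
<li>closed li</li>
</ul>
some text
</div>
</div>
EOF

if command -v tidy >/dev/null 2>&1; then
  for value in yes no; do
    echo "--- hide-endtags: $value ---"
    # Options given after -config are applied after the file has been read.
    # Tidy's exit status is non-zero for warnings, so it is ignored here.
    tidy -config reindent.conf --hide-endtags "$value" fragment.html || true
  done
else
  echo "tidy not found on PATH; nothing was run" >&2
fi
```

Keeping the override on the command line means one config file can serve both runs, which makes it easier to see exactly which end tags the option adds or removes.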

kobame answered Sep 21 '22 11:09