Simple HTML Pretty Print

http://jsfiddle.net/JamesKyle/L4b8b/

This may be a futile effort, but I personally think it's possible.

I'm not the best at JavaScript or jQuery, but I think I have found a simple way of making a basic pretty-printer for HTML.

There are four types of code in this prettyprint:

  1. Plain Text
  2. Elements
  3. Attributes
  4. Values

In order to style this, I want to wrap elements, attributes and values in spans, each with its own class.

The first way I have of doing this is to store every single kind of element and attribute (shown below) and then wrap them with the corresponding spans:

$(document).ready(function() {

    $('pre.prettyprint.html').each(function() {

        $(this).css('white-space','pre-line');

        var code = $(this).html();

        var htmlElement = $(code).find('a, abbr, acronym, address, area, article, aside, audio, b, base, bdo, bdi, big, blockquote, body, br, button, canvas, caption, cite, code, col, colgroup, command, datalist, dd, del, details, dfn, div, dl, dt, em, embed, fieldset, figcaption, figure, footer, form, h1, h2, h3, h4, h5, h6, head, header, hgroup, hr, html, i, img, input, ins, kbd, keygen, label, legend, li, link, map, mark, meta, meter, nav, noscript, object, ol, optgroup, option, output, p, param, pre, progress, q, rp, rt, ruby, samp, script, section, select, small, source, span, strong, summary, style, sub, sup, table, tbody, td, textarea, tfoot, th, thead, title, time, tr, track, tt, ul, var, video, wbr');

        var htmlAttribute = $(code).find('abbr, accept-charset, accept, accesskey, action, align, alink, alt, archive, axis, background, bgcolor, border, cellpadding, cellspacing, char, charoff, charset, checked, cite, class, classid, clear, code, codebase, codetype, color, cols, colspan, compact, content, coords, data, datetime, declare, defer, dir, disabled, enctype, face, for, frame, frameborder, headers, height, href, hreflang, hspace, http-equiv, id, ismap, label, lang, language, link, longdesc, marginheight, marginwidth, maxlength, media, method, multiple, name, nohref, noresize, noshade, nowrap, object, onblur, onchange, onclick, ondblclick, onfocus, onkeydown, onkeypress, onkeyup, onload, onmousedown, onmousemove, onmouseout, onmouseover, onmouseup, onreset, onselect, onsubmit, onunload, profile, prompt, readonly, rel, rev, rows, rowspan, rules, scheme, scope, scrolling, selected, shape, size, span, src, standby, start, style, summary, tabindex, target, text, title, type, usemap, valign, value, valuetype, version, vlink, vspace, width');

        var htmlValue = $(code).find(/* Any instance of text in between two quotation marks */);

        $(htmlElement).wrap('<span class="element" />');
        $(htmlAttribute).wrap('<span class="attribute" />');
        $(htmlValue).wrap('<span class="value" />');

        $(code).find('<').replaceWith('&lt;');
        $(code).find('>').replaceWith('&gt;');
    });
});

The second way I thought of was to detect elements as any text surrounded by < and >, then detect attributes as text inside an element that either has a space on both sides or has an = immediately after it.

$(document).ready(function() {

    $('pre.prettyprint.html').each(function() {

        $(this).css('white-space','pre-line');

        var code = $(this).html();

        var htmlElement = $(code).find(/* Any instance of text in between < and > */);

        var htmlAttribute = $(code).find(/* Any instance of text inside an element that has an = immediately afterwards or has spaces on either side */);

        var htmlValue = $(code).find(/* Any instance of text in between two quotation marks */);

        $(htmlElement).wrap('<span class="element" />');
        $(htmlAttribute).wrap('<span class="attribute" />');
        $(htmlValue).wrap('<span class="value" />');

        $(code).find('<').replaceWith('&lt;');
        $(code).find('>').replaceWith('&gt;');
    });
});
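
For what it's worth, the second approach can be sketched by running regular expressions over the raw markup string instead of using jQuery selectors. This is only a rough sketch under assumptions: `escapeHtml` and `highlightHtml` are made-up names, the patterns treat comments and doctypes as plain text, they know nothing about script or style contents, and none of this is a real HTML parser.

```javascript
// Escape the characters that would otherwise be parsed as markup.
function escapeHtml(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical helper: wrap tag names, attribute names and values in spans.
function highlightHtml(source) {
    // A tag: "<", optional "/", a name, then everything up to the closing ">".
    var tagPattern = /<(\/?)([a-zA-Z][\w:-]*)((?:"[^"]*"|'[^']*'|[^>"'])*)>/g;
    // An attribute: a name, optionally followed by "=" and a quoted or bare value.
    var attrPattern = /([^\s=\/"']+)(?:(\s*=\s*)("[^"]*"|'[^']*'|[^\s"']+))?/g;

    var out = '';
    var last = 0;
    var match;
    while ((match = tagPattern.exec(source)) !== null) {
        // Plain text between the previous tag and this one.
        out += escapeHtml(source.slice(last, match.index));
        var attrs = match[3].replace(attrPattern, function (whole, name, eq, value) {
            var result = '<span class="attribute">' + escapeHtml(name) + '</span>';
            if (value !== undefined) {
                result += eq + '<span class="value">' + escapeHtml(value) + '</span>';
            }
            return result;
        });
        out += '&lt;' + match[1] +
            '<span class="element">' + match[2] + '</span>' +
            attrs + '&gt;';
        last = tagPattern.lastIndex;
    }
    // Plain text after the last tag.
    return out + escapeHtml(source.slice(last));
}
```

With jQuery this could be applied as `$(this).html(highlightHtml($(this).text()))`, assuming the pre holds the markup as escaped text.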

How would either of these be coded, if at all possible?

Again you can see this as a jsfiddle here: http://jsfiddle.net/JamesKyle/L4b8b/

asked Dec 01 '11 21:12 by James Kyle


2 Answers

Don't be so sure you have gotten all there is to pretty-printing HTML in so few lines. It took me a little more than a year and 2000 lines to really nail this topic. You can just use my code directly or refactor it to fit your needs:

https://github.com/prettydiff/prettydiff/blob/master/lib/markuppretty.js (and GitHub project)

You can demo it at http://prettydiff.com/?m=beautify&html

The reason why it takes so much code is that people really don't seem to understand or value the importance of text nodes. If you are adding new and empty text nodes during beautification then you are doing it wrong and are likely corrupting your content. It is also really easy to screw it up the other way and remove white space from inside your content. You have to be careful about both or you will completely destroy the integrity of your document.
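
To make the text node problem concrete, here is a small sketch. Everything in it is an illustrative assumption: `renderedText` is a crude stand-in for how a browser collapses white space in normal flow, and `naiveBeautify` is a deliberately bad beautifier that puts every tag on its own line.

```javascript
// Crude approximation of rendered text: drop the tags, then collapse
// runs of white space to a single space, as normal-flow rendering does.
function renderedText(markup) {
    return markup
        .replace(/<[^>]*>/g, '')
        .replace(/\s+/g, ' ')
        .trim();
}

// Deliberately naive beautifier: every tag goes on its own line,
// which introduces new white space text around inline elements.
function naiveBeautify(markup) {
    return markup
        .replace(/</g, '\n<')
        .replace(/>/g, '>\n')
        .trim();
}

var original = '<p>un<b>believ</b>able</p>';
var beautified = naiveBeautify(original);

console.log(renderedText(original));   // unbelievable
console.log(renderedText(beautified)); // un believ able
```

The markup looks tidier after beautification, but the word now renders with spaces inside it, which is exactly the kind of content corruption described above.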

Also, what if your document contains CSS or JavaScript? Those should be pretty printed as well, but have very different requirements from HTML. Even HTML and XML have different requirements. Please take my word for it that this is not a simple thing to figure out. HTML Tidy has been at this for more than a decade and still screws up a lot of edge cases.

As far as I know my markup_beauty.js application is the most complete pretty-printer ever written for HTML/XML. I know that is a very bold statement, and perhaps arrogant, but so far it has never been challenged. Look at my code and, if there is something you need that it is not doing, please let me know and I will get around to adding it.

answered Oct 21 '22 04:10 by austincheney


Personally, I would wrap the HTML in a pre element and not try to do any pretty-printing. There are tons of libraries for code formatting; just search for "pretty print". Simply wrapping the HTML in pre will make it display as 'printed' code.
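
A minimal sketch of that idea, assuming the markup arrives as a plain string and has to be escaped before it goes inside the pre (the names `escapeForPre` and `wrapInPre` are made up for illustration):

```javascript
// Escape markup characters so the browser shows them instead of parsing them.
// The ampersand must be replaced first, or the entities would be double-escaped.
function escapeForPre(source) {
    return source
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Hypothetical helper: wrap the escaped markup in a pre element.
function wrapInPre(source) {
    return '<pre>' + escapeForPre(source) + '</pre>';
}

console.log(wrapInPre('<b>a & b</b>'));
// <pre>&lt;b&gt;a &amp; b&lt;/b&gt;</pre>
```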

For JavaScript objects, you can use JSON.stringify to print the data with indentation by passing the number of spaces per nesting level as the third argument.

JSON.stringify({ name: 'value' }, null, 2); // Change 2 to 4 for four-space indentation
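
A slightly larger sketch of the same call, using a made-up `config` object. Note that JSON.stringify only serializes data: function-valued properties such as `helper` are dropped, so it cannot reproduce arbitrary JavaScript source.

```javascript
// Hypothetical sample data; `helper` is a function and will be omitted.
var config = {
    name: 'value',
    nested: { list: [1, 2], flag: true },
    helper: function () {}
};

// Third argument as a number: that many spaces per nesting level.
var twoSpaces = JSON.stringify(config, null, 2);

// Third argument as a string: that string is used for each level instead.
var tabbed = JSON.stringify(config, null, '\t');

console.log(twoSpaces);
console.log(tabbed);
```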

answered Oct 21 '22 03:10 by Drew