
What do I need to escape inside the html <pre> tag


I use the <pre> tag in my blog to post code. I know I have to change < to &lt; and > to &gt;. Are there any other characters I need to escape to produce valid HTML?

Alec Jacobson asked Aug 13 '09 16:08


2 Answers

Consider what happens if you use the <pre> tag to display HTML markup on your blog:

<pre>Use a <span style="background: yellow;">span tag with style attribute</span> to highlight words</pre>

This will pass HTML validation, but does it produce the expected result? No. The browser renders the span as a real element instead of showing the markup. The correct way is:

<pre>Use a &lt;span style=&quot;background: yellow;&quot;&gt;span tag with style attribute&lt;/span&gt; to highlight words</pre>

Another example: if you use the pre tag to display code in some other language, HTML encoding is still required:

<pre>if (i && j) return;</pre>

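This will probably produce the expected result, but does it pass HTML validation? Not under HTML 4 or XHTML, where a bare & is an error (HTML5 tolerates an & followed by a space, but escaping it is still the safe choice). The correct way is: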

<pre>if (i &amp;&amp; j) return;</pre>

Long story short, HTML-encode the content of a pre tag just the way you do with other tags.
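
As a rough illustration of what "HTML-encode" means in practice, here is a minimal JavaScript sketch. The function name escapeHtml is hypothetical (it is not a built-in or part of any library), and the sketch only covers the five characters that matter in element content and quoted attribute values.

```javascript
// Hypothetical helper: escape the characters that are special in HTML.
// The ampersand is replaced first; otherwise the "&" produced by the
// later replacements (e.g. in "&lt;") would be escaped a second time.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Wrap the escaped code in a <pre> element, as in the examples above.
const preBlock = "<pre>" + escapeHtml("if (i && j) return;") + "</pre>";
console.log(preBlock); // <pre>if (i &amp;&amp; j) return;</pre>
```

Escaping the quotes is not strictly needed inside <pre> content, but doing it anyway means the same helper is safe for attribute values too.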

Salman A answered Sep 22 '22 06:09


TL;DR

  • PHP: htmlspecialchars($html);
  • JavaScript (JS): Element.innerText = "<html>...";

Note that <pre> only changes how the text is displayed (whitespace and font); its content is still parsed as HTML, so you have to escape ALL HTML inside it.

Only for you HTML "fossils": using the <xmp> tag

This is not well known, but the <xmp> tag really does exist, and even Chrome still supports it. However, relying on it is NOT recommended. It is only for you HTML fossils, although it is a very simple way to handle personal content, e.g. docs. Even the w3.org Wiki says in its example: "No, really. don't use it."

You can put ANY HTML (except the </xmp> end tag itself) inside <xmp></xmp>:

<xmp>
<html> <br> just any other html tags...
</xmp>
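
The one restriction above can be checked before dropping text into an <xmp> element. The following is only a sketch: isSafeForXmp is a hypothetical name, and it assumes that the only thing that can end the element early is an </xmp end tag, matched case-insensitively and followed by whitespace, a slash, or >.

```javascript
// Hypothetical check: can this text go inside <xmp>...</xmp> unescaped?
// Assumption: only an "</xmp" end tag (any letter case) followed by
// whitespace, "/" or ">" terminates the element.
function isSafeForXmp(text) {
  return !/<\/xmp[\s\/>]/i.test(text);
}

console.log(isSafeForXmp("<html> <br> just any other html tags...")); // true
console.log(isSafeForXmp("text </XMP> more text"));                   // false
```

If the check fails, the text has to be escaped and shown in a regular element such as <pre> instead.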

The proper version

The proper approach is to store the HTML as a STRING and display it with the help of some escaping function or mechanism.

Just remember one thing: in C-like languages, string literals are usually written between single or double quotes. If you wrap your string in double quotes, escape the double quotes inside it (usually with \); if you wrap it in single quotes, escape the single quotes inside it (usually with \).
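
A small JavaScript sketch of that quoting rule (the variable names are made up for illustration). Note that this is escaping at the source-code level, which is separate from the HTML escaping applied when the string is displayed.

```javascript
// Three ways to write the same string literal in JavaScript.
const inDoubleQuotes = "<body class=\"test\">it's</body>"; // escape the double quotes
const inSingleQuotes = '<body class="test">it\'s</body>';  // escape the single quote
const inTemplate = `<body class="test">it's</body>`;       // neither needs escaping

// All three literals produce an identical string value.
console.log(inDoubleQuotes === inSingleQuotes && inSingleQuotes === inTemplate); // true
```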

The most common way: server-side escaping (e.g. in PHP)

Server-side scripting languages often have some built-in function to escape HTML.

<?php
   $html = "<html> <br> or just any other HTML"; //store html
   echo htmlspecialchars($html); //display escaped html
?>

Note that as of PHP 8.1 you no longer have to specify the ENT_QUOTES flag, because the default value of the flags parameter changed:

flags changed from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.

The client-side way (examples in JavaScript and jQuery)

A similar approach works in client-side scripts.

Pure JavaScript

There is no built-in escaping function, but you get the same effect by default when you set an element's innerText or a node's textContent:

document.querySelector('.myTest').innerText = "<html><head>...";
document.querySelector('.myTest').textContent = "<html><head>...";

HTMLElement.innerText and Node.textContent are not the same thing! The MDN documentation for each property explains the difference.

jQuery (a JS library)

jQuery has $jqueryEl.text() for this purpose:
$('.mySomething .test').text("<html><head></head><body class=\"test\">...");

As on the server side, remember to escape whichever quotes you wrapped your string in.

jave.web answered Sep 18 '22 06:09