How to stop an html TEXTAREA from decoding html entities

I have a strange problem:

In the database, I have a literal ampersand lt semicolon:

&lt;div

Whenever it's printed into an HTML textarea tag, the source code of the page shows the &gt; as >.

How do I stop this decoding?

243

asked Dec 15 '11 11:12

Rami Dabain

2 Answers

You can't stop entities being decoded in a textarea[1] since the content of a textarea is not (unlike a script or style element) intrinsic CDATA, even though error recovery may sometimes give the impression that it is.

The definition of the textarea element is:

<!ELEMENT TEXTAREA - - (#PCDATA)       -- multi-line text field -->

i.e. it contains PCDATA which is described as:

Document text (indicated by the SGML construct "#PCDATA"). Text may contain character references. Recall that these begin with & and end with a semicolon (e.g., "Herg&eacute;'s adventures of Tintin" contains the character entity reference for the e acute character).

This means that when you type (the invalid HTML of) "start of tag" (<) the browser corrects it to "less than sign" (&lt;), but when you type "start of entity" (&), which is allowed, no error correction takes place.

You need to write what you mean. If you want to include some HTML as data then you must convert any character with special meaning to its respective character reference.

If the data is:

&lt;div

Then the HTML must be:

<textarea>&amp;lt;div</textarea>

You can use the standard functions for converting this (e.g. PHP's htmlspecialchars or Perl's HTML::Entities module).
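
As a minimal sketch of that conversion in PHP (the $stored variable, the explicit ENT_QUOTES flag and the UTF-8 charset are assumptions for illustration, not part of the question):

```php
<?php
// Hypothetical value as stored in the database: the literal text "&lt;div".
$stored = '&lt;div';

// Escape the characters that are special in HTML (&, <, > and quotes).
// The ampersand becomes &amp;, so the browser decodes the result
// back to exactly the stored text.
$escaped = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// The page source now contains &amp;lt;div inside the textarea.
echo '<textarea>' . $escaped . '</textarea>';
```

The user then sees &lt;div in the textarea, which is the same text that sits in the database.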

NB 1: If you were using XHTML[2] (and really using it, it doesn't count if you serve it as text/html) then you could use an explicit CDATA block:

<textarea><![CDATA[&lt;div]]></textarea>

NB 2: Or if browsers implemented HTML 4 correctly

Ok, but the question is: why does it decode them anyway? Assuming I've added &lt; and saved the textarea, it will be saved as &lt; but displayed as <. Saving it again will convert it back to < (but it will remain < in the database), and saving again will save it as < in the database. Why does the textarea decode it?

The server sends (to the browser) data encoded as HTML.
The browser sends (to the server) data encoded as application/x-www-form-urlencoded (or multipart/form-data).

Since the browser is not sending the data as HTML, the characters are not represented as HTML entities.

If you take the data received from the client and then put it into an HTML document, then you must encode it as HTML first.
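
The round trip described above can be sketched roughly as follows. This is only a simulation under stated assumptions: html_entity_decode() stands in for the browser parsing the textarea, parse_str() stands in for PHP filling $_POST, and the field name body is hypothetical.

```php
<?php
// Hypothetical stored value: the literal text "&lt;div".
$stored = '&lt;div';

// Server to browser: encode the value as HTML before placing it in the textarea.
$inPageSource = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');

// The browser decodes character references while parsing the textarea.
// html_entity_decode() only stands in for that browser step here.
$shownToUser = html_entity_decode($inPageSource, ENT_QUOTES, 'UTF-8');

// Browser to server: the form body is URL-encoded plain text, not HTML.
// parse_str() stands in for PHP populating $_POST from that body.
parse_str('body=' . urlencode($shownToUser), $post);

// The submitted value matches the stored text, with no extra decoding.
var_dump($post['body'] === $stored); // bool(true)
```

If the escaping step is skipped, the browser decodes &lt; to < on display, and that decoded character is what comes back on the next save.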

145

answered Sep 18 '22 01:09

Quentin

In PHP, this can be done using htmlentities(). Example below.

<?php
$content = "This string contains the TM symbol: &trade;";
// Encode the string as HTML so the textarea shows the literal text "&trade;".
print "<textarea>" . htmlentities($content) . "</textarea>";
?>

Without htmlentities(), the browser would decode the entity and the textarea would display the TM symbol (™) instead of the literal text "&trade;".

http://php.net/manual/en/function.htmlentities.php

answered Sep 21 '22 01:09

Chris Hubbard
