Unescape HTML entities in JavaScript?

I have some JavaScript code that communicates with an XML-RPC backend. The XML-RPC returns strings of the form:

&lt;img src='myimage.jpg'&gt;

However, when I use the JavaScript to insert the strings into HTML, they render literally. I don't see an image, I literally see the string:

<img src='myimage.jpg'> 

My guess is that the HTML is being escaped over the XML-RPC channel.

How can I unescape the string in JavaScript? I tried the techniques on this page, unsuccessfully: http://paulschreiber.com/blog/2008/09/20/javascript-how-to-unescape-html-entities/

What are other ways to diagnose the issue?

Joseph Turian asked Dec 16 '09 05:12



2 Answers

Most answers given here have a huge disadvantage: if the string you are trying to convert isn't trusted, you will end up with a Cross-Site Scripting (XSS) vulnerability. For the function in the accepted answer, consider the following:

htmlDecode("<img src='dummy' onerror='alert(/xss/)'>"); 

The string here contains an unescaped HTML tag, so instead of decoding anything the htmlDecode function will actually run JavaScript code specified inside the string.

This can be avoided by using DOMParser which is supported in all modern browsers:

function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

console.log(htmlDecode("&lt;img src='myimage.jpg'&gt;"));
// "<img src='myimage.jpg'>"

console.log(htmlDecode("<img src='dummy' onerror='alert(/xss/)'>"));
// ""

This function is guaranteed not to run any JavaScript code as a side effect. Any HTML tags are ignored; only the text content is returned.

Compatibility note: Parsing HTML with DOMParser requires at least Chrome 30, Firefox 12, Opera 17, Internet Explorer 10, Safari 7.1 or Microsoft Edge. So all browsers without support are way past their EOL and as of 2017 the only ones that can still be seen in the wild occasionally are older Internet Explorer and Safari versions (usually these still aren't numerous enough to bother).

Wladimir Palant answered Oct 09 '22 22:10


Do you need to decode all encoded HTML entities or just &amp; itself?

If you only need to handle &amp; then you can do this:

var decoded = encoded.replace(/&amp;/g, '&'); 
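
If you need a few more entities than &amp; (for example the five that XML itself predefines, which is likely what an XML-RPC channel would produce), the same replace idea can be extended. The sketch below is only an illustration: decodeXmlEntities and XML_ENTITIES are hypothetical names, not part of any library, and it deliberately ignores the much larger set of HTML named entities such as &nbsp;. It works in a single pass, so &amp;lt; decodes to &lt; rather than being decoded twice into <.

```javascript
// Hypothetical lookup table: the five entities predefined by XML.
var XML_ENTITIES = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

// Hypothetical helper: decodes the five XML entities plus numeric
// character references (&#65; and &#x41;) in one pass over the string.
function decodeXmlEntities(encoded) {
  return encoded.replace(
    /&(amp|lt|gt|quot|apos|#\d+|#x[0-9a-fA-F]+);/g,
    function (match, name) {
      if (name.charAt(0) === '#') {
        var code = name.charAt(1) === 'x'
          ? parseInt(name.slice(2), 16)
          : parseInt(name.slice(1), 10);
        // Leave out-of-range references untouched instead of throwing.
        if (code > 0x10FFFF) {
          return match;
        }
        return String.fromCodePoint(code);
      }
      return XML_ENTITIES[name];
    }
  );
}

console.log(decodeXmlEntities("&lt;img src='myimage.jpg'&gt;"));
// "<img src='myimage.jpg'>"
```

Because this never touches the DOM, it cannot execute anything while decoding. The decoded string can still contain markup, though, so inserting it into a page as HTML is only safe if the source is trusted.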

If you need to decode all HTML entities then you can do it without jQuery:

var elem = document.createElement('textarea');
elem.innerHTML = encoded;
var decoded = elem.value;

Please take note of Mark's comments below which highlight security holes in an earlier version of this answer and recommend using textarea rather than div to mitigate against potential XSS vulnerabilities. These vulnerabilities exist whether you use jQuery or plain JavaScript.

LukeH answered Oct 09 '22 20:10