I have some JavaScript code that communicates with an XML-RPC backend. The XML-RPC returns strings of the form:
&lt;img src='myimage.jpg'&gt;
However, when I use the JavaScript to insert the strings into HTML, they render literally. I don't see an image, I literally see the string:
<img src='myimage.jpg'>
My guess is that the HTML is being escaped over the XML-RPC channel.
How can I unescape the string in JavaScript? I tried the techniques on this page, unsuccessfully: http://paulschreiber.com/blog/2008/09/20/javascript-how-to-unescape-html-entities/
What are other ways to diagnose the issue?
JavaScript unescape() function: unescape() takes a string as a parameter and decodes a string that was encoded by the escape() function. Each hexadecimal escape sequence in the string (%XX or %uXXXX) is replaced by the character it represents. Note that unescape() is deprecated and does not decode HTML entities such as &lt;.
One way to unescape HTML entities is to assign the escaped text to the innerHTML of a textarea element. The browser decodes the entities, so we can read the unescaped text back from the textarea's value. This is typically wrapped in an htmlDecode function that takes the input string as a parameter.
The escape() function is the counterpart of unescape(): escape() encodes a string, and unescape() decodes a string that escape() produced.
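To make the distinction concrete, here is a small sketch showing that unescape() only reverses the percent-style escapes produced by escape() and leaves HTML entities untouched. The sample strings are made up for illustration, and decodeURIComponent() is shown alongside only because unescape() is deprecated.

```javascript
// Sample inputs (illustrative only, not taken from the XML-RPC backend).
const percentEncoded = escape("<img src='myimage.jpg'>");
const entityEncoded = "&lt;img src='myimage.jpg'&gt;";

// unescape() reverses escape(): %XX sequences become characters again.
const fromPercent = unescape(percentEncoded);

// HTML entities contain no %XX sequences, so unescape() returns them unchanged.
const fromEntities = unescape(entityEncoded);

// decodeURIComponent() is the non-deprecated way to decode %XX sequences.
const viaDecodeURIComponent = decodeURIComponent("%3Cimg%3E");

console.log(fromPercent);
console.log(fromEntities);
console.log(viaDecodeURIComponent);
```

If the string still shows &lt; after this step, the problem is HTML entity encoding rather than percent encoding, and unescape() is the wrong tool.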
Most answers given here have a huge disadvantage: if the string you are trying to convert isn't trusted then you will end up with a Cross-Site Scripting (XSS) vulnerability. For the function in the accepted answer, consider the following:
htmlDecode("<img src='dummy' onerror='alert(/xss/)'>");
The string here contains an unescaped HTML tag, so instead of decoding anything the htmlDecode function will actually run the JavaScript code specified inside the string.
This can be avoided by using DOMParser which is supported in all modern browsers:
function htmlDecode(input) {
  var doc = new DOMParser().parseFromString(input, "text/html");
  return doc.documentElement.textContent;
}

console.log( htmlDecode("&lt;img src='myimage.jpg'&gt;") )
// "<img src='myimage.jpg'>"

console.log( htmlDecode("<img src='dummy' onerror='alert(/xss/)'>") )
// ""
This function is guaranteed not to run any JavaScript code as a side effect. Any HTML tags are ignored; only the text content is returned.
Compatibility note: Parsing HTML with DOMParser requires at least Chrome 30, Firefox 12, Opera 17, Internet Explorer 10, Safari 7.1 or Microsoft Edge. So all browsers without support are way past their EOL, and as of 2017 the only ones that can still be seen in the wild occasionally are older Internet Explorer and Safari versions (usually these still aren't numerous enough to bother).
Do you need to decode all encoded HTML entities or just &amp; itself?
If you only need to handle &amp; then you can do this:
var decoded = encoded.replace(/&amp;/g, '&');
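If the backend only ever emits a handful of entities, a DOM-free decoder is one possible middle ground between the single replace above and a full entity decoder. The sketch below is not part of the original answer: the name decodeBasicEntities and the choice of supported entities (the five predefined XML entities plus numeric character references) are assumptions made for illustration. It decodes in a single pass, so a double-escaped value such as &amp;lt; becomes &lt; rather than <.

```javascript
// Hypothetical helper: decodes the five predefined XML entities and
// numeric character references, without touching the DOM.
const NAMED_ENTITIES = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };

function decodeBasicEntities(encoded) {
  // One pass over the input: text produced by a replacement is never
  // rescanned, so "&amp;lt;" decodes to "&lt;" and not to "<".
  return encoded.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-z]+);/g, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references exactly as they were.
      return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
    }
    // Names outside the small table (for example &nbsp;) are left unchanged.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(decodeBasicEntities("&lt;img src=&#39;myimage.jpg&#39;&gt;"));
// <img src='myimage.jpg'>
```

The result is a plain string. Whether it is safe to insert as HTML still depends on whether the source is trusted, since decoding turns the text back into live markup.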
If you need to decode all HTML entities then you can do it without jQuery:
var elem = document.createElement('textarea');
elem.innerHTML = encoded;
var decoded = elem.value;
Please take note of Mark's comments below, which highlight security holes in an earlier version of this answer and recommend using textarea rather than div to mitigate against potential XSS vulnerabilities. These vulnerabilities exist whether you use jQuery or plain JavaScript.