In JavaScript (server-side Node.js) I'm writing a program that generates XML as output.
I am building the XML by concatenating a string:

str += '<' + key + '>';
str += value;
str += '</' + key + '>';

The problem is: what if value contains characters like '&', '>' or '<'? What's the best way to escape those characters?
Or is there a JavaScript library that can escape XML entities?
XML escape characters

There are only five:

"   &quot;
'   &apos;
<   &lt;
>   &gt;
&   &amp;

Escaping characters depends on where the special character is used. The examples can be validated at the W3C Markup Validation Service.
Using the Escape Character ( \ )

We can use the backslash ( \ ) escape character to prevent JavaScript from interpreting a quote as the end of the string. The syntax of \' will always be a single quote, and the syntax of \" will always be a double quote, without any fear of breaking the string.
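A small sketch of what the backslash does and does not do may help here. The variable names are only illustrative. The point is that the backslash lives in the JavaScript source and never reaches the string value, so it offers no protection for XML output.

```javascript
// Backslash escapes only affect how the JavaScript source is parsed;
// the resulting strings hold the bare quote characters.
const single = 'It\'s here';
const double = "She said \"hi\"";

console.log(single); // It's here
console.log(double); // She said "hi"

// The backslash is not part of the value, so the attribute below is
// still malformed XML: the inner quotes end the attribute early.
const xml = '<note text="' + double + '"/>';
console.log(xml); // <note text="She said "hi""/>
```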
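The only characters that must be escaped are &, < and > (as well as " or ' in attributes, depending on which character is used to delimit the attribute value: attr="must use &quot; here, ' is allowed" and attr='must use &apos; here, " is allowed'). Strictly speaking, > only has to be escaped when it appears in the sequence ]]>, but escaping it everywhere is harmless and customary. These characters are escaped using XML entities; in this case you want &amp; for &.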
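The difference between element content and attribute values can be sketched as two small helpers. The names escapeText and escapeAttr are hypothetical, and the sketch assumes attributes are always delimited with double quotes, so a single quote can stay as it is.

```javascript
// Escape text that goes between tags. Only & and < are strictly required;
// > is escaped as well so the sequence "]]>" can never appear.
function escapeText(s) {
  return String(s)
    .replace(/&/g, '&amp;') // must run first, or later entities get re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Escape a value for a double-quoted attribute (assumption: the caller
// always writes attr="..."), so only " needs escaping on top of the above.
function escapeAttr(s) {
  return escapeText(s).replace(/"/g, '&quot;');
}

const title = escapeAttr('say "hi" & \'bye\'');
const body = escapeText('a < b && c > d');
console.log('<p title="' + title + '">' + body + '</p>');
// <p title="say &quot;hi&quot; &amp; 'bye'">a &lt; b &amp;&amp; c &gt; d</p>
```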
HTML encoding simply replaces the &, ", ', < and > characters with their entity equivalents. Order matters: if you don't replace the & characters first, you'll double-encode some of the entities:
if (!String.prototype.encodeHTML) {
  String.prototype.encodeHTML = function () {
    // & goes first so the entities produced below are not encoded again
    return this.replace(/&/g, '&amp;')
               .replace(/</g, '&lt;')
               .replace(/>/g, '&gt;')
               .replace(/"/g, '&quot;')
               .replace(/'/g, '&apos;');
  };
}

As @Johan B.W. de Vries pointed out, this will have issues with the tag names. To clarify, I assumed this was being used for the value only.
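For the string concatenation in the question, one possible variant avoids both the ordering concern and the change to String.prototype by escaping in a single pass with a lookup table. This is only a sketch: escapeXml, element and XML_ENTITIES are made-up names, and the key is assumed to already be a valid XML tag name.

```javascript
// Map each special character to its predefined XML entity.
const XML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&apos;',
};

// One pass over the input: each character is looked at once, so an
// entity that was just produced can never be escaped a second time.
function escapeXml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => XML_ENTITIES[ch]);
}

// Build an element the way the question does, escaping only the value.
// Assumption: key is a trusted, valid tag name and is not escaped.
function element(key, value) {
  return '<' + key + '>' + escapeXml(value) + '</' + key + '>';
}

console.log(element('title', 'Tom & Jerry <3'));
// <title>Tom &amp; Jerry &lt;3</title>
```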
Conversely, if you want to decode HTML entities [1], make sure you decode &amp; to & after everything else so that you don't double-decode any entities:

if (!String.prototype.decodeHTML) {
  String.prototype.decodeHTML = function () {
    // &amp; goes last so that "&amp;lt;" decodes to "&lt;", not "<"
    return this.replace(/&apos;/g, "'")
               .replace(/&quot;/g, '"')
               .replace(/&gt;/g, '>')
               .replace(/&lt;/g, '<')
               .replace(/&amp;/g, '&');
  };
}

[1] Just the basics, not including &copy; to © or other such things.
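Decoding can also be done in a single pass, which sidesteps the ordering issue entirely. The sketch below handles only the five predefined entities; unescapeXml and XML_CHARS are hypothetical names, and numeric references such as &#39; are deliberately left untouched.

```javascript
// Map each predefined entity name back to its character.
const XML_CHARS = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

// Each entity is matched and replaced once; text produced by a
// replacement is never scanned again, so nothing is decoded twice.
function unescapeXml(text) {
  return String(text).replace(
    /&(amp|lt|gt|quot|apos);/g,
    (match, name) => XML_CHARS[name]
  );
}

console.log(unescapeXml('Tom &amp; Jerry &lt;3')); // Tom & Jerry <3
console.log(unescapeXml('&amp;lt;'));              // &lt;
```

The second call shows the case that goes wrong when &amp; is decoded first: the escaped text "&lt;" has to come back as those four characters, not as "<".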
As for libraries, Underscore.js (or Lodash, if you prefer) provides an _.escape method that performs this escaping.