
Sanitizing user input before adding it to the DOM in Javascript

I'm writing the JS for a chat application I'm working on in my free time, and I need HTML ids that change according to user-submitted data. This is usually something conceptually shaky enough that I would not even attempt it, but I don't see myself having much of a choice this time. What I need to do, then, is escape the HTML id to make sure it can't allow XSS or break the HTML.

Here's the code:

    var user_id = escape(id);
    var txt = '<div class="chut">'+
                '<div class="log" id="chut_'+user_id+'"></div>'+
                '<textarea id="chut_'+user_id+'_msg"></textarea>'+
                '<label for="chut_'+user_id+'_to">To:</label>'+
                '<input type="text" id="chut_'+user_id+'_to" value='+user_id+' readonly="readonly" />'+
                '<input type="submit" id="chut_'+user_id+'_send" value="Message"/>'+
              '</div>';

What would be the best way to escape id to avoid the problems mentioned above? As you can see, right now I'm using the built-in escape() function, but I'm not sure how good it is compared to the alternatives. I'm mostly used to sanitizing input before it goes in a text node, not in an id itself.

asked May 08 '10 12:05 by I GIVE TERRIBLE ADVICE



2 Answers

Never use escape(). It's nothing to do with HTML-encoding. It's more like URL-encoding, but it's not even properly that. It's a bizarre non-standard encoding available only in JavaScript.
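To make the difference concrete, here is a small comparison of escape() and encodeURIComponent() on a few inputs. The sample strings are arbitrary choices for illustration. Neither function produces HTML entities such as &lt; or &amp;; both emit percent sequences, and they disagree on non-ASCII characters and on +.

```javascript
// Compare the legacy escape() with encodeURIComponent().
// The sample inputs are illustrative only.
const samples = ['é', '€', 'a+b', '<b>'];

const comparison = samples.map((s) => ({
  input: s,
  legacy: escape(s),            // Latin-1 style %XX, or %uXXXX above U+00FF
  uri: encodeURIComponent(s),   // percent-encoded UTF-8 bytes
}));

console.log(comparison);
```

Both turn < into %3C, which happens to keep markup intact, but the result is a percent-encoded string rather than HTML-encoded text.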

If you want an HTML encoder, you'll have to write it yourself as JavaScript doesn't give you one. For example:

    function encodeHTML(s) {
        return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/"/g, '&quot;');
    }

However, whilst this is enough to put your user_id in places like the input value (as long as the attribute value is wrapped in double quotes), it's not enough for id, because IDs can only use a limited selection of characters. (And % isn't among them, so escape() or even encodeURIComponent() is no good.)

You could invent your own encoding scheme to put any characters in an ID, for example:

    function encodeID(s) {
        if (s==='') return '_';
        return s.replace(/[^a-zA-Z0-9.-]/g, function(match) {
            return '_'+match[0].charCodeAt(0).toString(16)+'_';
        });
    }
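To see what this scheme produces, here is a short sketch that runs the encoder on a few made-up inputs. The decodeID function is not part of the answer; it is a hypothetical inverse, added only to show that the encoding is reversible, so two different inputs should never map to the same id.

```javascript
// Encoder from the answer, written with an arrow function.
function encodeID(s) {
  if (s === '') return '_';
  return s.replace(/[^a-zA-Z0-9.-]/g, (ch) => '_' + ch.charCodeAt(0).toString(16) + '_');
}

// Hypothetical inverse (an assumption, not from the answer).
function decodeID(id) {
  if (id === '_') return '';
  return id.replace(/_([0-9a-f]+)_/g, (m, hex) => String.fromCharCode(parseInt(hex, 16)));
}

const inputs = ['', 'alice', 'john doe', 'a_b', '<script>'];
const encoded = inputs.map(encodeID);
console.log(encoded);
console.log(encoded.map(decodeID));
```

Note that the underscore itself is encoded (as _5f_), which is what keeps the _hex_ groups unambiguous.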

But you've still got a problem if the same user_id occurs twice. And to be honest, the whole thing with throwing around HTML strings is usually a bad idea. Use DOM methods instead, and retain JavaScript references to each element, so you don't have to keep calling getElementById, or worrying about how arbitrary strings are inserted into IDs.

For example:

    function addChut(user_id) {
        // Build each element with DOM methods; no HTML string is ever parsed
        var log= document.createElement('div');
        log.className= 'log';
        var textarea= document.createElement('textarea');
        var input= document.createElement('input');
        input.value= user_id;
        input.readOnly= true;   // the DOM property is readOnly, and the literal is lowercase true
        var button= document.createElement('input');
        button.type= 'button';
        button.value= 'Message';

        var chut= document.createElement('div');
        chut.className= 'chut';
        chut.appendChild(log);
        chut.appendChild(textarea);
        chut.appendChild(input);
        chut.appendChild(button);
        document.getElementById('chuts').appendChild(chut);

        // The closure keeps references to textarea and user_id, so no id lookup is needed
        button.onclick= function() {
            alert('Send '+textarea.value+' to '+user_id);
        };

        return chut;
    }

You could also use a convenience function or JS framework to cut down on the lengthiness of the create-set-append calls there.

ETA:

"I'm using jQuery at the moment as a framework"

OK, then consider the jQuery 1.4 creation shortcuts, e.g.:

    var log= $('<div>', {className: 'log'});
    var input= $('<input>', {readOnly: true, val: user_id});
    ...

"The problem I have right now is that I use JSONP to add elements and events to a page, and so I can not know whether the elements already exist or not before showing a message."

You can keep a lookup of user_id to element nodes (or wrapper objects) in JavaScript, to save putting that information in the DOM itself, where the characters that can go in an id are restricted.

    var chut_lookup= {};
    ...

    function getChut(user_id) {
        var key= '_map_'+user_id;
        if (key in chut_lookup)
            return chut_lookup[key];
        return chut_lookup[key]= addChut(user_id);
    }

(The _map_ prefix is there because plain JavaScript objects don't quite work as a mapping of arbitrary strings. The empty string and, in IE, some Object member names confuse them.)
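In environments with ES2015 support, a Map is one way to get the same lookup without a key prefix, since it accepts any string as a key. This is a swapped-in alternative, not what the answer uses. In the sketch below, the create parameter is a hypothetical stand-in for addChut, so the snippet can run without a DOM.

```javascript
// Lookup from user_id to whatever addChut() would return.
const chutLookup = new Map();

// `create` is a hypothetical stand-in for addChut(user_id).
function getChut(userId, create) {
  if (!chutLookup.has(userId)) {
    chutLookup.set(userId, create(userId));
  }
  return chutLookup.get(userId);
}

// Usage with a fake factory that just records the id.
let created = 0;
const fakeAddChut = (userId) => ({ userId, serial: ++created });

const first = getChut('__proto__', fakeAddChut);
const again = getChut('__proto__', fakeAddChut);
const empty = getChut('', fakeAddChut);
console.log(first === again, created);
```

Awkward keys such as the empty string or '__proto__' need no special handling with this approach.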

answered Sep 19 '22 14:09 by bobince


You can use this:

    function sanitize(string) {
      const map = {
          '&': '&amp;',
          '<': '&lt;',
          '>': '&gt;',
          '"': '&quot;',
          "'": '&#x27;',
          "/": '&#x2F;',
      };
      const reg = /[&<>"'/]/ig;
      return string.replace(reg, (match)=>(map[match]));
    }

Also see OWASP XSS Prevention Cheat Sheet.
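One possible way to apply this function consistently is a tagged template that escapes every interpolated value. The helper name safeHtml and the sample input are assumptions made for this sketch. This kind of escaping suits HTML text and quoted attribute values; it does not make a string valid as an id.

```javascript
// Same character map as the sanitize() function above.
const ENTITY_MAP = {
  '&': '&amp;', '<': '&lt;', '>': '&gt;',
  '"': '&quot;', "'": '&#x27;', '/': '&#x2F;',
};

function sanitize(string) {
  return String(string).replace(/[&<>"'/]/g, (match) => ENTITY_MAP[match]);
}

// Hypothetical tagged-template helper: literal parts pass through untouched,
// every interpolated value goes through sanitize().
function safeHtml(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? sanitize(values[i]) : ''),
    ''
  );
}

const userName = '<img src=x onerror="alert(1)">';
const markup = safeHtml`<label>To: ${userName}</label>`;
console.log(markup);
```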

answered Sep 22 '22 14:09 by SilentImp