This question arose from an earlier one:
Why does the browser modify the ID of an HTML element that contains &#x?
Given the following web page:
<html>
<head>
<script type="text/javascript">
// --------------------------------------------------------
// could calling this method produce an XSS attack?
// --------------------------------------------------------
function decodeEntity(text){
    text = text.replace(/<(.*?)>/g,''); // strip out all HTML tags, to prevent possible XSS
    var div = document.createElement('div');
    div.innerHTML = text; // let the browser decode the entities
    return div.textContent?div.textContent:div.innerText;
}
function echoValue(){
    var e = document.getElementById(decodeEntity("/path/&#x24;whatever"));
    if(e) {
        alert(e.innerHTML);
    }
    else {
        alert("not found\n");
    }
}
</script>
</head>
<body>
<p id="/path/&#x24;whatever">The Value</p>
<button onclick="echoValue()">Tell me</button>
</body>
</html>
The id of the <p> element contains characters that were escaped in order to prevent XSS attacks. Both the HTML part and the JS part are generated by the server, and the server inserts the same escaped value (which could originate from an untrusted source) in both parts.
The server escapes character ranges in the &#x format; the only characters that are not escaped are:
.
/
0123456789
A-Z
_
a-z
Now, I have to get access to that <p> through JavaScript. The function echoValue() in the referenced question always failed because the browser converts &#x24; to $ in the HTML part but leaves it as &#x24; in the JS part.
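The mismatch can be reproduced without a browser. The snippet below is only an illustration: jsLiteral, domId and idsMatch are names made up for this sketch, and it assumes the escaped character is $ (code point 0x24), written by the server as &#x24;.

```javascript
// What the script sees: character references are not decoded inside
// <script> content, so the string still holds the raw entity text.
const jsLiteral = "/path/&#x24;whatever";

// What the DOM stores for the id attribute: the HTML parser has already
// replaced &#x24; with the character at code point 0x24.
const domId = "/path/" + String.fromCodePoint(0x24) + "whatever";

// The two strings differ, so a lookup with the raw literal finds nothing.
const idsMatch = jsLiteral === domId;
```

This is why document.getElementById() returns null unless the JS value is decoded first.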
So, Gareth came up with an answer that is simple and works.
My concern is that the XSS risk that was eliminated by escaping the dynamic strings might return when using the decodeEntity() function provided in the referenced answer.
Could anybody point out whether there are security concerns (which ones?) or not (why not?)?
I first suggest you have a look at the following links discussing HTML sanitization in JavaScript and XSS in JavaScript:
Security Lesson no 1: Don't reinvent the wheel. If something has been done before, chances are they did a better job than your ad hoc solution.
Even though I can't find a way to exploit your simple regex off the top of my head, I am not convinced it really captures all cases. The first link provides a solution that is more elaborate and has been reviewed and tested thoroughly.
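One gap in the regex can be checked without a DOM: in JavaScript, . does not match a newline unless the s flag is set, so a tag that contains a line break is not stripped. The sketch below isolates the same replace() call in a helper named stripTags (a name made up here). Whether such input can reach decodeEntity() depends on the server-side escaping described in the question, which would normally have escaped < and the newline already.

```javascript
// Same tag-stripping step as in decodeEntity(), isolated so it runs
// without a browser. stripTags is a hypothetical helper name.
function stripTags(text) {
  return text.replace(/<(.*?)>/g, '');
}

// A tag on a single line is removed as intended.
const closedTag = stripTags("<b>bold</b>");

// A tag with a newline before the closing ">" is left untouched, because
// "." cannot cross the line break and the pattern never reaches ">".
const multilineInput = "<img src=x onerror=alert(1)\n>";
const multilineTag = stripTags(multilineInput);
```

If a string like multilineInput were then assigned to innerHTML, a browser would most likely parse it as an element, which is the situation the stripping was meant to prevent.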
I also suggest you look at the XSS Filter Evasion Cheat Sheet. It shows quite well what kind of nasty things people might come up with.
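If the server really only ever emits hexadecimal references of the form &#x…; (an assumption based on the question), one way to sidestep the concern is to decode them with plain string operations, so that no markup is ever handed to innerHTML. This is a minimal sketch, not a reviewed sanitizer, and decodeHexEntities is a name invented for it.

```javascript
// Decode only hexadecimal character references such as &#x24;.
// Nothing is parsed as HTML, so the result is just a string.
function decodeHexEntities(text) {
  return text.replace(/&#x([0-9a-fA-F]{1,6});/g, function (match, hex) {
    var codePoint = parseInt(hex, 16);
    // Values beyond the Unicode range are left as-is instead of throwing.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// Usage: the decoded value can be passed to document.getElementById().
var decodedId = decodeHexEntities("/path/&#x24;whatever");
```

Even if the decoded string contains < or >, it is only used as a lookup key for getElementById() and is never interpreted as markup.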