I have a textbox in a form which needs to accept input with HTML tags.
Submitting input that contains HTML tags makes the app throw an HttpRequestValidationException, unless we use HttpUtility.HtmlEncode. Easy so far.
However, the input may also contain symbols, such as the 'degrees' symbol (°). When these are also HTML encoded, they become numeric escape codes, in this example &#176;. These codes also cause HttpRequestValidationException to be thrown, but the question is why?
I can't see why numeric escape codes are thought of as potentially dangerous, especially as ° works as input just fine.
I seem to be stuck, as leaving the input as-is fails due to the tags, and HTML encoding the input fails due to the numeric escapes. My solution so far has been to HTML encode, then regex replace the escape sequences with their HTML decoded forms, but I'm not sure if this is a safe solution, as I assume the escape sequences are seen as dangerous for a reason.
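As a rough illustration of that workaround, here is a minimal Python sketch (not ASP.NET code). The helper names are hypothetical, and `html_encode_like_aspnet` is only an assumed approximation of what HttpUtility.HtmlEncode does (escaping `&`, `<`, `>`, `"` and emitting characters 160 to 255 as numeric escapes). It also adds one restriction the description above does not mention: only escapes in that 160 to 255 range are turned back into literal characters.

```python
import re


def html_encode_like_aspnet(text):
    """Assumed approximation of HttpUtility.HtmlEncode, for illustration only."""
    out = []
    for ch in text:
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif ch == '"':
            out.append("&quot;")
        elif 160 <= ord(ch) <= 255:
            # Characters such as the degree sign become numeric escapes.
            out.append("&#%d;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


NUMERIC_ESCAPE = re.compile(r"&#(\d+);")


def restore_high_escapes(encoded):
    """Turn numeric escapes for code points 160-255 back into literal characters."""

    def replace(match):
        code_point = int(match.group(1))
        if 160 <= code_point <= 255:
            return chr(code_point)
        # Anything else (for example an escaped angle bracket) is left untouched.
        return match.group(0)

    return NUMERIC_ESCAPE.sub(replace, encoded)


def encode_for_post(text):
    """Encode the input, then undo only the harmless high-character escapes."""
    return restore_high_escapes(html_encode_like_aspnet(text))
```

Under these assumptions, `encode_for_post("<b>25°C</b>")` yields `&lt;b&gt;25°C&lt;/b&gt;`. An escape typed by the user, such as `&#60;`, has its ampersand encoded first and so never matches the pattern.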
This is due to ASP.NET's built-in cross-site scripting (XSS) request validation. There is a rough list of what ASP.NET allows and rejects here on SO: ASP.NET request validation causes: is there a list?
On the specific case of &#-encoded characters, there is a complete reference of XSS attacks available here: XSS (Cross Site Scripting) Cheat Sheet. It demonstrates how complex these attacks can be, and why encoded characters are forbidden.
ASP.NET considers HTML character escapes (&#xxx;) dangerous for the same reason it considers angle brackets dangerous, i.e. XSS. Using such an escape, you can include any character (for example, an angle bracket). Here's a summary of what request validation does in ASP.NET 1.1 and 2.0.
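To make that point concrete, here is a small Python sketch (an illustration of how HTML decoding treats numeric escapes, not of ASP.NET's validator). The `payload` string is a made-up example. It contains no literal angle brackets, yet it decodes to a script tag:

```python
import html

# Hypothetical input that contains no literal '<' or '>' characters.
payload = "&#60;script&#62;alert(1)&#60;/script&#62;"

# Decoding the numeric escapes reveals the markup hidden inside them.
decoded = html.unescape(payload)

# The same mechanism is harmless for a symbol such as the degree sign.
degree = html.unescape("&#176;")
```

Here `decoded` is `<script>alert(1)</script>` and `degree` is `°`. Both use the same escape syntax, so a check that only looks for the `&#` prefix cannot tell the harmless case from the dangerous one.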
In legitimate cases such as yours, you can choose any of the options below