Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Allow Double URL Encoded Request Paths To Be Valid

I have a standard ASP.Net WebForms application running on IIS 7.0 with an Integrated Managed Pipeline. Many of the images on our site have spaces in their files names (e.g. './baseball drawing.gif'). When we place these images into our html pages we url encode the paths so that our html img tags look like this <img src='./baseball%20drawing.gif' />

Now, the problem comes in when certain search engines and webcrawlers try to index our site. When they scrape our pages they will html encode our already html-encoded paths getting image links like this './baseball%2520drawing.gif' where %25 is the url encoding for '%'. This causes two problems:

  1. When users get results from these search engines they receive broken links.
  2. When users attempt to navigate to these broken links it throws errors in our system.

As you can see this is a lose lose situation. Users get broken links, and we get noise in our error logs.

I've been trying to figure out how to correct this problem with no luck. Here is what I've tried:

  1. Set <requestFiltering allowDoubleEscaping='true'> in web.config to prevent the "404.11 URL Double Escaped error". This fixed the first error but caused a new one, "a potentially dangerous Request.Path was found".
  2. Removed the '%' from the <httpRuntime requestPathInvalidChars> to prevent the "potentially dangerous Request.Path" error. This fixed the second error but now we have a third one, "Resource can't be found".
  3. I placed a break in my code to watch Request.Path. It looks like it is right with a value of 'Ball Image.gif' instead of 'Ball%2520Image.gif'. With this being the case I'm not sure why it isn't working.

I feel like I have a super hack where I am having to disable everything without really understanding why nothing is working. So I guess my question is three fold

  1. Why did solution attempt 1 not take care of the problem?
  2. Why did solution 2 not take care of the problem?
  3. Why does my Request.Path look right in step 3 but it still doesn't work?

Any help anyone can provide would be greatly appreciated.

like image 782
Mark Rucker Avatar asked Jan 06 '12 22:01

Mark Rucker


People also ask

What is double URL encoding?

Double URI-encoding is a special type of double encoding in which data is URI-encoded twice in a row. It has been used to bypass authorization schemes and security filters against code injection, directory traversal, cross-site scripting (XSS) and SQL injection.

What is URL encoding give example?

All characters to be URL-encoded are encoded using a '%' character and a two-character hex value corresponding to their UTF-8 character. For example, 上海+中國 in UTF-8 would be URL-encoded as %E4%B8%8A%E6%B5%B7%2B%E4%B8%AD%E5%9C%8B .

How do you encode a URL?

URL Encoding (Percent Encoding) URLs can only be sent over the Internet using the ASCII character-set. Since URLs often contain characters outside the ASCII set, the URL has to be converted into a valid ASCII format. URL encoding replaces unsafe ASCII characters with a "%" followed by two hexadecimal digits.


1 Answers

OK, after much searching of the internets and plenty of experimentation I think I finally understand what is going on. My main problem was a case of extreme confirmation bias. Everything I read said what I wanted to hear rather than what it actually said. I am going to summarize greatly the key points I needed to understand in order to answer my question.

  1. First, I needed to understand that IIS and ASP.Net are two different applications. What IIS does in a nutshell is receive a request, route that request to an application that handles it, gets the output from the handling application, and then sends the output from the application back to the requester. What ASP.Net does is receive the request from IIS, handle it, and then pass the response back to IIS. This is a huge over-generalization of the whole process but for my purposes here it is good enough.1

  2. Incoming ASP.Net requests have to pass through two gatekeepers. The IIS7 RequestFiltering module(configured in system.webserver/requestFiltering2), and then the ASP.Net HttpRuntime request filters(configured in system.web/httpRuntime3).

  3. The IIS RequestFiltering module is the only one that normalizes incoming requests and it only applies normalization ONE time. Again I repeat it only applies it ONE time. Even if <requestFiltering allowDoubleEscaping="true" /> it will still only apply normalization once. So that means '%2520' will be normalized to '%20'. At this point if allowDoubleEscaping is false IIS will not let the request through since '%20' could still be normalized. If, however, allowDoubleEscaping is set to true then IIS7 will pass off the request '%20' to the next gatekeeper, ASP.Net. This was the cause of the first error.

  4. The Asp.net filter is where the requestPathInvalidCharacters are checked. So now our '%20' is invalid because by default '%' is a part of requestPathInvalidCharacters. If we remove the '%' from that list we will make it through the second gatekeeper and ASP.Net will try to handle our request. This was the cause of the second error.

  5. Now ASP.net will try to convert our virtual path into a physical one on the server. Unfortunately, we still have a '%20' in our path instead of the ' ' we want so ASP.Net isn't able to find the resource we want and throws a "resource can't be found error". The reason the path looked right to me when I broke in my code is because I placed a watch on the Request.Url property. This property tries to be helpful by applying its own normalization in its ToString() method thus making our %20 look like the ' ' we want even though it isn't. This was the cause of the final error.

To make this work we could write our own custom module that receives the request after the first two gatekeepers and fully normalizes it before handing it off to ASP.Net. Doing this though would allow any character to come through as long as it was URL encoded. For example, we normally don't want to allow a '<' or a '>' in our paths since these can be used to insert tags into our code. As things work right now the < and > will not get past the ASP.Net filter since they are part of the requestPathInvalidCharacters. However, encoded as a %253C and a %253E they can if we open the first two gates and then normalize the request in our own custom module before handing it off to ASP.Net.

In conclusion, allowing %2520 to be fully normalized can't be done without creating a large security hole. If it were possible to tell the RequestFiltering module to fully normalize every request it receives before testing that request against the first two gatekeepers then it would be much safer but right now that functionality isn't available.

If I got anything wrong let me know and I hope this helps somebody.

like image 178
Mark Rucker Avatar answered Oct 27 '22 21:10

Mark Rucker