IIS and ASP.NET (MVC) have some glitches when working with URLs that have %-encoding in the path (not the query string; the query string is fine). How can I get around this? That is, how can I get the actual URL that was requested?
For example, if I navigate to /x%3Fa%3Db and (separately) to /x?a=b, both of them report the .Request.Url as /x?a=b, because the encoded data in the path is reported incorrectly.
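To make the ambiguity concrete, here is a minimal sketch that decodes the first path outside of IIS. The class name and the two literal request targets are illustrative assumptions; only Uri.UnescapeDataString is a real framework call. Once the path has been decoded, it is textually identical to the path-plus-query form, which is why the two requests cannot be told apart afterwards.

```csharp
using System;

class DecodedPathDemo
{
    static void Main()
    {
        // Hypothetical raw request targets, as a client would send them.
        string encodedPath = "/x%3Fa%3Db"; // path only: '?' and '=' are data here
        string withQuery = "/x?a=b";       // path "/x" plus query string "a=b"

        // Decoding the first form yields text identical to the second.
        string decoded = Uri.UnescapeDataString(encodedPath);
        Console.WriteLine(decoded);              // /x?a=b
        Console.WriteLine(decoded == withQuery); // True
    }
}
```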
URLs cannot contain literal spaces. URL encoding replaces a space with %20; in form-encoded query strings a plus (+) sign is also used.
Another oddity is that URLs copied out of Firefox or Chrome come out URL encoded, which can be very annoying. To prevent this, type a character in the address bar and erase it before you copy the URL.
Why do we need to encode? URLs can only contain certain characters from the standard 128-character ASCII set. Characters outside that allowed set, and reserved characters (such as ?, / and #) that are meant as plain data, must be percent-encoded before they are placed in a URL.
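As a small illustration of the two space encodings, the sketch below runs the same value through HttpUtility.UrlEncode (the call used in the code further down) and Uri.EscapeDataString. The class name and sample value are assumptions made for the example; on .NET Framework the project needs a reference to System.Web.

```csharp
using System;
using System.Web;

class SpaceEncodingDemo
{
    static void Main()
    {
        string value = "a b"; // sample value, chosen only for illustration

        // Form-style encoding: the space becomes a plus sign.
        Console.WriteLine(HttpUtility.UrlEncode(value)); // a+b

        // Percent-encoding: the space becomes %20, which is the safe form inside a path.
        Console.WriteLine(Uri.EscapeDataString(value));  // a%20b
    }
}
```

In a path segment a literal + is just a plus sign, so the %20 form is the one that survives a round trip there.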
The way I've tackled this is to look at the underlying server variables: the URL variable contains a decoded value, while the QUERY_STRING variable contains the still-encoded query. We can't just call encode on the URL part, because that also contains the original / separators etc. in their original form; if we blindly encode the entire thing we'll get unwanted %2f values. However, we can pull it apart and spot problematic cases:
using System;
using System.Text;
using System.Text.RegularExpressions;
using System.Web;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class RequestExtensions
{
    // Matches a path (or a single segment) that needs no encoding at all.
    private static readonly Regex simpleUrlPath = new Regex("^[-a-zA-Z0-9_/]*$", RegexOptions.Compiled);
    private static readonly char[] segmentsSplitChars = { '/' };
    // ^^^ avoids lots of gen-0 arrays being created when calling .Split

    public static Uri GetRealUrl(this HttpRequest request)
    {
        if (request == null) throw new ArgumentNullException("request");
        var baseUri = request.Url; // use this primarily to avoid needing to process the protocol / authority
        try
        {
            var vars = request.ServerVariables;
            var url = vars["URL"];
            if (string.IsNullOrEmpty(url) || simpleUrlPath.IsMatch(url)) return baseUri; // nothing to do - looks simple enough even for IIS
            var query = vars["QUERY_STRING"];
            // here's the thing: url contains *decoded* values; query contains *encoded* values
            // loop over the segments, encoding each separately
            var sb = new StringBuilder(url.Length * 2); // allow double to be pessimistic; we already expect trouble
            var segments = url.Split(segmentsSplitChars);
            foreach (var segment in segments)
            {
                if (segment.Length == 0)
                {
                    // empty segment: skip the leading one, keep a trailing '/'
                    if (sb.Length != 0) sb.Append('/');
                }
                else if (simpleUrlPath.IsMatch(segment))
                {
                    sb.Append('/').Append(segment);
                }
                else
                {
                    // note: UrlEncode writes a space as '+'; Uri.EscapeDataString writes %20 instead
                    sb.Append('/').Append(HttpUtility.UrlEncode(segment));
                }
            }
            if (!string.IsNullOrEmpty(query)) sb.Append('?').Append(query); // query is already encoded; nothing to do
            return new Uri(baseUri, sb.ToString());
        }
        catch (Exception ex)
        { // if something unexpected happens, default to the broken ASP.NET handling
            GlobalApplication.LogException(ex); // application-specific logger; substitute your own
            return baseUri;
        }
    }
}
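The per-segment idea can also be tried outside IIS with plain strings. The following is a minimal sketch, not the method above: EncodePath and PathEncodingSketch are hypothetical names, it swaps HttpUtility.UrlEncode for Uri.EscapeDataString (so spaces become %20 and hex digits are upper case), and it keeps every empty segment rather than collapsing them.

```csharp
using System;
using System.Text;

static class PathEncodingSketch
{
    // Hypothetical helper: re-encodes a decoded path one segment at a time,
    // so the '/' separators survive while data characters are escaped.
    public static string EncodePath(string decodedPath)
    {
        var sb = new StringBuilder(decodedPath.Length * 2);
        string[] segments = decodedPath.Split('/');
        for (int i = 0; i < segments.Length; i++)
        {
            if (i > 0) sb.Append('/'); // restore the separator removed by Split
            sb.Append(Uri.EscapeDataString(segments[i]));
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(EncodePath("/x?a=b"));    // /x%3Fa%3Db
        Console.WriteLine(EncodePath("/docs/a b")); // /docs/a%20b
    }
}
```

Fed the decoded value from the URL server variable, this reproduces the originally requested path for the example in the question, provided the client used the same escaping in the first place.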