The following line of code throws an exception. Is this a bug in the framework? If not, what approach could I take instead?
It seems to be the ":" (colon) that causes the issue; however, I do see such URIs working fine on production websites (i.e. it seems to be a valid URI in the real world).
Uri relativeUri = new Uri("http://test.com/asdf").MakeRelativeUri(new Uri("http://test.com/xx:yy"));
// gives => System.UriFormatException: A relative URI cannot be created because the
// 'uriString' parameter represents an absolute URI
Uri relativeUri = new Uri("http://test.com/asdf").MakeRelativeUri(new Uri("http://test.com/xxyy"));
// this works - removed the colon between the xx and yy
PS. Given the above, which .NET class/method could I use (noting that I am parsing an HTML page from the web) to take (a) the page URI and (b) the relative string from an HTML HREF attribute [e.g. "/xx:yy" in this case] and return a valid URI that can be used to address that resource?
In other words, how do I mimic the behavior of a browser, which combines the HREF and the page URI to produce the URI it navigates to when you click the link?
I consider it a bug.
RFC 1738 says that ":" (amongst other characters) may be reserved for special meaning within a scheme. However, the http scheme does not reserve it in the path part:
Within the <path> and <searchpart> components, "/", ";", "?" are reserved.
(Not ":".)
hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ]
So, http://test.com/xx:yy is a valid URI. The newer RFC 3986 agrees:
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
However, relativised against http://test.com/asdf, the resultant xx:yy would be read as an absolute URI (with xx as the scheme) and is not a valid relative URI:
path-noscheme = segment-nz-nc *( "/" segment )
segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
; non-zero-length segment without any colon ":"
So MakeRelativeUri is kind of right to report that there's a problem, but really it should fix it automatically by encoding the : that is valid in an absolute URI as a %3A, which is valid in the first segment of a relative URI.
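As a rough illustration of that encoding step, here is a minimal sketch. The class and method names (RelativeRefHelper, EscapeFirstSegmentColon) are hypothetical and not part of the framework, and it assumes you already have the relative reference as a plain string:

```csharp
using System;

// Hypothetical helper, not part of System.Uri.
public static class RelativeRefHelper
{
    // Percent-encodes any ':' in the first path segment of a relative
    // reference, so the result is not mistaken for "scheme:rest".
    public static string EscapeFirstSegmentColon(string relativeRef)
    {
        if (relativeRef == null) throw new ArgumentNullException(nameof(relativeRef));

        // The first segment ends at the first '/', '?' or '#', or at the end of the string.
        int end = relativeRef.IndexOfAny(new[] { '/', '?', '#' });
        if (end < 0) end = relativeRef.Length;

        string firstSegment = relativeRef.Substring(0, end).Replace(":", "%3A");
        return firstSegment + relativeRef.Substring(end);
    }
}
```

With this sketch, "xx:yy" becomes "xx%3Ayy", while "/xx:yy" and "a/b:c" come back unchanged because their first segment contains no colon.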
I would generally try to avoid MakeRelativeUri in favour of root-relative URIs, which are easier to extract and don't have this problem (/xx:yy is OK).
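For the browser-like resolution asked about in the question, a sketch along these lines should work. The HrefResolver wrapper and its method names are made up for illustration; Uri.TryCreate(Uri, string, out Uri) and Uri.PathAndQuery are existing System.Uri members:

```csharp
using System;

// Hypothetical wrapper around existing System.Uri members.
public static class HrefResolver
{
    // Combines the page URI with an HREF value, roughly as a browser would.
    // Returns null if the combination cannot be parsed.
    public static Uri Resolve(Uri pageUri, string href)
    {
        Uri result;
        return Uri.TryCreate(pageUri, href, out result) ? result : null;
    }

    // Root-relative form of an absolute URI: path plus query, with no
    // scheme or host. Note that any fragment is not included.
    public static string ToRootRelative(Uri absoluteUri)
    {
        return absoluteUri.PathAndQuery;
    }
}
```

For example, resolving "/xx:yy" against http://test.com/asdf yields http://test.com/xx:yy, and the root-relative form of that result is "/xx:yy" again, so the colon never ends up in the first segment of a relative reference.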
Colons play a special role in URLs, for instance to denote a port, and are therefore 'reserved' (see here).
URLs use some characters for special use in defining their syntax. When these characters are not used in their special role inside a URL, they need to be encoded
So, the colon should be escaped.