
Why is http:///example.org (with a triple slash) treated as a valid URL by Firefox and WebKit?

When the URL http:///example.org is opened in Firefox or WebKit-based browsers, it loads http://example.org. I wonder whether this behavior is valid, i.e. whether the extra slash should be stripped and example.org treated as the authority component. Having read the specification (RFC 3986), I got the impression that the authority component of such a URI should be considered empty. Some other HTTP clients, such as curl or links2, won't resolve the URL.

Is this a bug in the browsers, or valid behavior in accordance with the RFC? Edit: Or is it an intended feature, meant to make browsers more user-friendly?

peter asked Mar 31 '14 21:03



1 Answer

The specification of the "http" protocol requires a hostname in the URI. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.2.2. So the string http:///foo is not a valid http URI, and the browser is faced with the question of what to do with the invalid URI string.
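To make the "empty authority" reading concrete, here is a minimal sketch that applies the generic URI-splitting regular expression from RFC 3986, Appendix B. The helper name `split_rfc3986` is made up for this example, and the sketch only shows the generic syntax; it does not check the http-specific requirement that a host be present.

```python
import re

# Generic URI-splitting regex from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_rfc3986(uri):
    """Return (scheme, authority, path) per the generic RFC 3986 syntax.

    `split_rfc3986` is a hypothetical helper name used only for this example.
    The authority is None when the "//" marker is absent, and "" when the
    marker is present but nothing follows it before the next "/".
    """
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return m.group(2), m.group(4), m.group(5)


print(split_rfc3986("http://example.org"))   # authority 'example.org', path ''
print(split_rfc3986("http:///example.org"))  # authority '', path '/example.org'
```

Under the generic syntax the triple-slash form parses, but with an empty authority and example.org pushed into the path, which is why it fails the host requirement for "http" URIs.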

Gecko (Firefox) has a URI parser with scheme-dependent behavior: it assumes what you meant based on the URI scheme and applies certain fixups. See the comments at http://mxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsIStandardURL.idl?rev=f4157e8c4107&mark=20-23,28-31,36-39#20. "http" URIs are created with the URLTYPE_AUTHORITY flag, which leads to the behavior you see (per line 31 of nsIStandardURL.idl).

Note that the current attempt to standardize how URIs should be parsed in web pages and by web browsers, at http://url.spec.whatwg.org/, has a whitelist of schemes at http://url.spec.whatwg.org/#relative-scheme that behave like this. If you step through the parsing algorithm for schemes in that whitelist, once you see the ':' you enter the state at http://url.spec.whatwg.org/#authority-first-slash-state, which basically treats 0 or more slashes as all being equivalent to "//" and goes on to parse whatever follows the slashes as the "authority" section of the URL.
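The slash handling described above can be roughly sketched as follows. This is not the WHATWG algorithm or Gecko's code, only an illustration of the "any number of slashes counts as //" idea. The function name `normalize_slashes` and the `SPECIAL_SCHEMES` set are assumptions made for this example (the set is an illustrative subset, not the spec's actual whitelist), and base-URL resolution, backslashes, and "file" URLs are ignored.

```python
# Assumption: an illustrative subset of authority-based schemes,
# not the actual whitelist from the URL specification.
SPECIAL_SCHEMES = {"http", "https", "ftp", "ws", "wss"}


def normalize_slashes(url):
    """Rewrite 'scheme:' followed by 0 or more slashes as 'scheme://'.

    Hypothetical helper: URLs whose scheme is not in SPECIAL_SCHEMES,
    or that have no ':' at all, are returned unchanged.
    """
    scheme, sep, rest = url.partition(":")
    if not sep or scheme.lower() not in SPECIAL_SCHEMES:
        return url
    # Drop every leading slash, then put back exactly two, so the text
    # that follows is parsed as the authority.
    return f"{scheme}://{rest.lstrip('/')}"


print(normalize_slashes("http:///example.org"))      # http://example.org
print(normalize_slashes("http:example.org"))         # http://example.org
print(normalize_slashes("mailto:user@example.org"))  # unchanged
```

Only the run of slashes directly after the ':' is touched, so repeated slashes later in the path are left alone.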

Boris Zbarsky answered Nov 14 '22 23:11
