Say I have the following URL, which has a query string parameter that is itself a URL:
http://www.someSite.com?next=http://www.anotherSite.com?test=1&test=2
Should I URL-encode the next parameter? If I do, who's responsible for decoding it: the web browser or my web app?
The reason I ask is that I see lots of big sites that do things like the following:
http://www.someSite.com?next=http://www.anotherSite.com/another/url
In the above, they don't bother encoding the next parameter because, I'm guessing, they know it doesn't have any query string parameters itself. Is this OK to do if my next URL doesn't include any query string parameters either?
Why do we need to encode? URLs can only contain characters from the standard 128-character ASCII set, and only some of those may appear unescaped. Characters outside that set, and reserved characters that are used as data rather than as delimiters, must be percent-encoded before being placed in a URL.
Query parameters are name/value pairs attached to the end of a URL after the ? character. They carry data that helps select specific content or actions.
Characters such as /, ?, :, @, and & are all reserved and must be encoded when they appear as data rather than in their reserved role. For example, & is reserved for use as the query string delimiter, and : is reserved to delimit host/port and user/password components.
RFC 2396 sec. 2.2 says that you should URL-encode those symbols anywhere they're not used for their explicit meanings; i.e. you should always form targetUrl + '?next=' + urlencode(nextURL).
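As a rough illustration of the targetUrl + '?next=' + urlencode(nextURL) pattern, here is a minimal Python sketch. The variable names are made up for this example, and Python's urllib.parse.quote stands in for the generic urlencode() mentioned above.

```python
from urllib.parse import quote

# Hypothetical values taken from the question's example.
target_url = "http://www.someSite.com"
next_url = "http://www.anotherSite.com?test=1&test=2"

# safe="" makes quote() encode "/" too; by default it leaves "/" alone.
encoded_next = quote(next_url, safe="")

full_url = target_url + "?next=" + encoded_next
print(full_url)
```

With the inner ?, = and & escaped, the outer query string contains exactly one parameter, next, no matter what the nested URL contains.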
The web browser does not 'decode' those parameters at all; the browser doesn't know anything about the parameters but just passes along the string. A query string of the form http://www.example.com/path/to/query?param1=value&param2=value2 is GET-requested by the browser as:
GET /path/to/query?param1=value&param2=value2 HTTP/1.1
Host: www.example.com
(other headers follow)
On the backend, you'll need to parse the results. I think PHP's $_REQUEST array will have already done this for you; in other languages you'll want to split over the first ? character, then split over the & characters, then split each pair over the first = character, then urldecode both the name and the value.
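The manual steps above could be sketched in Python roughly as follows. The function name parse_query is hypothetical, and in real Python code the standard library's urllib.parse.parse_qsl already does this job; the sketch only spells out the order of operations.

```python
from urllib.parse import unquote_plus

def parse_query(url):
    """Hypothetical helper: split a URL's query string into decoded (name, value) pairs."""
    # 1. Split over the first "?": everything after it is the query string.
    _, _, query = url.partition("?")
    pairs = []
    # 2. Split over the "&" characters to get the individual parameters.
    for pair in query.split("&"):
        if not pair:
            continue  # skip empty pieces, e.g. from a trailing "&"
        # 3. Split over the first "=" to separate the name from the value.
        name, _, value = pair.partition("=")
        # 4. Decode both the name and the value only after splitting.
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs

encoded = "http://www.someSite.com?next=http%3A%2F%2Fwww.anotherSite.com%3Ftest%3D1%26test%3D2"
print(parse_query(encoded))
```

Decoding happens last on purpose: if the string were decoded before splitting, the escaped & and = inside the nested URL would be mistaken for delimiters.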
According to RFC 3986:
The query component is indicated by the first question mark ("?") character and terminated by a number sign ("#") character or by the end of the URI.
So the following URI is valid:
http://www.example.com?next=http://www.example.com
The following excerpt from the RFC, which refers to the slash ("/") and question mark ("?") characters, makes this clear:
... as query components are often used to carry identifying information in the form of "key=value" pairs and one frequently used value is a reference to another URI, it is sometimes better for usability to avoid percent-encoding those characters.
It is worth noting that RFC 3986 makes RFC 2396 obsolete.
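To see what this means in practice, here is a small Python sketch using the standard library's urlsplit and parse_qs. The variable names are illustrative only. An unencoded nested URL survives as long as it contains no & of its own, while the example from the question does not.

```python
from urllib.parse import urlsplit, parse_qs

# Nested URL without its own query parameters: "/", ":" and "?" pass through as data.
simple = parse_qs(urlsplit("http://www.example.com?next=http://www.example.com").query)
print(simple)

# Nested URL with its own "&": the outer query string is split at that "&".
nested = parse_qs(
    urlsplit("http://www.someSite.com?next=http://www.anotherSite.com?test=1&test=2").query
)
print(nested)
```

In the second case, next comes back as http://www.anotherSite.com?test=1 and test=2 becomes a parameter of the outer URL, which is why percent-encoding the whole value is the safer habit.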