
What is a valid URL query string?

What characters are allowed in a URL query string?

Do query strings have to follow a particular format?

Aran Mulholland asked Nov 14 '12 05:11


3 Answers

Per https://www.rfc-editor.org/rfc/rfc3986

In section 2.2 Reserved Characters, the following characters are listed:

reserved = gen-delims / sub-delims

gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

The spec then says:

If data for a URI component would conflict with a reserved character’s purpose as a delimiter, then the conflicting data must be percent-encoded before the URI is formed.

Next, in section 2.3 Unreserved Characters, the following are listed:

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
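
As a rough illustration of the two sections quoted above, here is a minimal Python sketch. The character sets are transcribed from the grammar above; the helper names `classify` and `encode_value` are made up for this example and are not part of any library. It assumes Python 3.7 or later, where `urllib.parse.quote` treats "~" as unreserved.

```python
import string
from urllib.parse import quote

# Character sets transcribed from RFC 3986, sections 2.2 and 2.3.
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
UNRESERVED = string.ascii_letters + string.digits + "-._~"


def classify(ch: str) -> str:
    """Classify a single character (hypothetical helper for illustration)."""
    if ch in UNRESERVED:
        return "unreserved"
    if ch in GEN_DELIMS or ch in SUB_DELIMS:
        return "reserved"
    return "other"


def encode_value(value: str) -> str:
    """Percent-encode everything except unreserved characters.

    safe="" tells quote() not to exempt "/" (its default safe character),
    so reserved characters inside the data cannot act as delimiters.
    """
    return quote(value, safe="")


print(classify("~"))            # unreserved
print(classify("&"))            # reserved
print(encode_value("a&b=c d"))  # a%26b%3Dc%20d
```

Encoding every reserved character in a value is stricter than the spec requires, but it is a simple way to make sure data never collides with a delimiter.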

Steven answered Oct 27 '22 01:10


Wikipedia has your answer: http://en.wikipedia.org/wiki/Query_string

"URL Encoding: Some characters cannot be part of a URL (for example, the space) and some other characters have a special meaning in a URL: for example, the character # can be used to further specify a subsection (or fragment) of a document; the character = is used to separate a name from a value. A query string may need to be converted to satisfy these constraints. This can be done using a schema known as URL encoding.

In particular, encoding the query string uses the following rules:

  • Letters (A-Z and a-z), numbers (0-9) and the characters '.','-','~' and '_' are left as-is
  • SPACE is encoded as '+' or '%20'
  • All other characters are encoded as %HH, where HH is the two-digit hexadecimal value of the byte, with any non-ASCII characters first encoded as UTF-8 (or other specified encoding)

The octet corresponding to the tilde ("~") character is often encoded as "%7E" by older URI processing implementations; the "%7E" can be replaced by "~" without changing its interpretation. The encoding of SPACE as '+' and the selection of "as-is" characters distinguishes this encoding from RFC 1738."
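
To make the quoted rules concrete, here is a small sketch using Python's standard `urllib.parse` module. The sample string is arbitrary, and the behaviour for "~" assumes Python 3.7 or later.

```python
from urllib.parse import quote, quote_plus, unquote

text = "café au lait"  # arbitrary sample containing a space and a non-ASCII letter

# Space as %20; "é" is first encoded as UTF-8 (bytes C3 A9), then percent-encoded.
as_percent = quote(text, safe="")

# Space as "+", the form used by application/x-www-form-urlencoded.
as_plus = quote_plus(text)

print(as_percent)  # caf%C3%A9%20au%20lait
print(as_plus)     # caf%C3%A9+au+lait

# "%7E" and "~" decode to the same character.
print(unquote("%7Euser") == "~user")  # True
```

Which space encoding to use depends on the consumer: form-style parsers typically expect "+", while "%20" is valid in any URI component.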

Regarding the format, query strings are conventionally a series of name-value pairs. The ? separates the query string from the rest of the URL. The pairs are separated from each other by an ampersand (&), and within each pair the name (key) and value are separated by an equals sign (=), e.g. http://domain.com?key=value&secondkey=secondvalue

Under Structure in the Wikipedia reference I provided:

  • The question mark is used as a separator and is not part of the query string.
  • The query string is composed of a series of field-value pairs
  • Within each pair, the field name and value are separated by an equals sign, '='.
  • The series of pairs is separated by the ampersand, '&' (or semicolon, ';' for URLs embedded in HTML and not generated by a ...; see below).
  • W3C recommends that all web servers support semicolon separators in addition to ampersand separators to allow application/x-www-form-urlencoded query strings in URLs within HTML documents without having to entity escape ampersands.
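
The structure described above can be sketched with Python's standard library. This is only an illustration: the URL is the example from this answer, and the second set of pairs is invented to show that delimiters inside values get encoded.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

url = "http://domain.com?key=value&secondkey=secondvalue"

# Everything after "?" (and before any "#") is the query string.
query = urlsplit(url).query
print(query)  # key=value&secondkey=secondvalue

# Split on "&", then on "=", and percent-decode each part.
pairs = parse_qsl(query)
print(pairs)  # [('key', 'value'), ('secondkey', 'secondvalue')]

# Going the other way: urlencode() escapes "&" and "=" inside values
# and writes spaces as "+".
rebuilt = urlencode([("q", "a&b"), ("name", "x y")])
print(rebuilt)  # q=a%26b&name=x+y
```

Parsing the rebuilt string with `parse_qsl` returns the original pairs, which is the point of encoding the delimiters in the first place.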

Clarice Bouwer answered Oct 27 '22 01:10


This page lists the percent-encoded values for the characters that most often need encoding in a URL:

https://perishablepress.com/url-character-codes/

For your convenience, this is the list:

<     %3C
>     %3E
#     %23
%     %25
{     %7B
}     %7D
|     %7C
\     %5C
^     %5E
~     %7E
[     %5B
]     %5D
`     %60
;     %3B
/     %2F
?     %3F
:     %3A
@     %40
=     %3D
&     %26
$     %24
+     %2B
"     %22
space     %20
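
Each code in the list is just "%" followed by the character's byte value in hexadecimal, so the list can be regenerated rather than memorised. The sketch below assumes UTF-8 for non-ASCII input; `percent_code` is a name invented for this example.

```python
def percent_code(ch: str) -> str:
    """Return the percent-encoded form of every byte of ch (UTF-8 assumed)."""
    return "".join(f"%{byte:02X}" for byte in ch.encode("utf-8"))


# The characters from the list above, in the same order.
for ch in '<>#%{}|\\^~[]`;/?:@=&$+" ':
    label = "space" if ch == " " else ch
    print(f"{label:<9} {percent_code(ch)}")
```

Note that this helper encodes unconditionally. A library encoder such as `urllib.parse.quote` leaves unreserved characters like "~" alone, since encoding them is allowed but unnecessary.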

Silvester Tony answered Oct 27 '22 01:10