
Google's URL encoding?

Tags: url

I have noticed that Google does not encode all special characters in the query part of the URL. For example:

Placing this string in Google's search: !@#$%^&*()

Yields this URL: https://www.google.com/#q=!%40%23%24%25^%26*()

Notice that !, ^, *, (, and ) are not encoded.
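For reference, the observed URL can be reproduced with standard percent-encoding if the five unencoded characters are treated as "safe". This is only a sketch of the observation: the safe set below is an assumption taken from the single example above, not a documented Google rule.

```python
from urllib.parse import quote

query = "!@#$%^&*()"

# Assumption: the characters observed as unencoded in the example URL.
# This is inferred from one example, not from any Google documentation.
observed_safe = "!^*()"

# quote() percent-encodes everything except letters, digits, "_.-~"
# and whatever is listed in `safe`.
encoded = quote(query, safe=observed_safe)

print("https://www.google.com/#q=" + encoded)
# prints: https://www.google.com/#q=!%40%23%24%25^%26*()
```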

Some of the characters such as : or < are considered unsafe or reserved, yet Google doesn't encode them.

Can someone explain why Google does this, and whether there is a reference document listing exactly which characters get encoded and which don't?

Thanks for any help!

Josh asked Nov 12 '22 20:11



1 Answer

As documented here:

Some characters are not safe to use in a URL without first being encoded. Because a Google search request is made by using an HTTP URL, the search request must follow URL conventions, including character encoding, where necessary.

The HTTP URL syntax defines that only alphanumeric characters, the special characters $-_.+!*'(), and the reserved characters ;/?:@=& can be used as values within an HTTP URL request. Since reserved characters are used by the search engine to decode the URL, and some special characters are used to request search features, then all non-alphanumeric characters used as a value to an input parameter must be URL-encoded.

To URL-encode a string:

Replace space characters with a "+" character.
Replace each non-alphanumeric character by its hexadecimal ASCII value, in the format of a "%" character followed by two hexadecimal digits. (Such an ASCII value may be referred to as an escape code.)
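The encoding steps above can be sketched in Python as follows. The function name `url_encode` is hypothetical, and encoding non-ASCII input as UTF-8 bytes is an assumption, since the quoted text only mentions ASCII values.

```python
def url_encode(value: str) -> str:
    """Encode a parameter value following the two rules quoted above."""
    out = []
    # Assumption: non-ASCII text is encoded byte by byte as UTF-8.
    for byte in value.encode("utf-8"):
        ch = chr(byte)
        if ch == " ":
            # Rule 1: spaces become "+".
            out.append("+")
        elif ch.isascii() and ch.isalnum():
            # ASCII letters and digits pass through unchanged.
            out.append(ch)
        else:
            # Rule 2: everything else becomes "%" plus two hex digits.
            out.append(f"%{byte:02X}")
    return "".join(out)


print(url_encode("c++ & java"))   # c%2B%2B+%26+java
print(url_encode("!@#$%^&*()"))   # %21%40%23%24%25%5E%26%2A%28%29
```

This is stricter than what the question observed in the browser, where !, ^, *, ( and ) were left as-is. Python's built-in `urllib.parse.quote_plus` behaves similarly but leaves `_`, `.`, `-` and `~` unencoded.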

Some input parameters require that the values passed to Google search are double-URL-encoded. This requirement means that you must apply the URL encoding to the string twice in succession to generate the final value.
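Double URL encoding simply means running the encoder over its own output, as in this minimal sketch. It uses Python's standard `urllib.parse.quote_plus`; the helper name `double_url_encode` is hypothetical, and which parameters actually need this is not stated in the quoted text.

```python
from urllib.parse import quote_plus


def double_url_encode(value: str) -> str:
    """Apply URL encoding twice in succession."""
    once = quote_plus(value)   # "a b&c" -> "a+b%26c"
    twice = quote_plus(once)   # "+" -> "%2B", "%" -> "%25"
    return twice


print(double_url_encode("a b&c"))  # a%2Bb%2526c
```

Note that the second pass re-encodes the "+" and "%" characters produced by the first pass, which is why the result contains `%2B` and `%25`.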

revo answered Dec 05 '22 23:12
