 

URL Encoding in Google Chrome

Does anyone know what encoding Google Chrome uses for encoding the URL?

The encoding happens when I copy a URL out of Chrome's address bar (the Omnibox).

I have pasted the following URL:

www.bing.com/search?q=이윤희&go=&qs=n&form=QBLH&filt=all&pq=이윤희&sc=0-0&sp=-1&sk=

into the address bar, and when I copy the same URL back out of it, it becomes this:

http://www.bing.com/search?q=%EC%9D%B4%EC%9C%A4%ED%9D%AC&go=&qs=n&form=QBLH&filt=all&pq=%EC%9D%B4%EC%9C%A4%ED%9D%AC&sc=0-0&sp=-1&sk=

I want to know what encoding they are using.

sourabh kasliwal asked Jan 31 '14 10:01




1 Answer

That's standard percent-encoding (URL encoding), applied here to UTF-8 encoded text. A URL may only contain a limited subset of ASCII characters, and the permitted subset differs between parts of the URL, so "이윤희" cannot appear in a URL as-is. To embed arbitrary characters, you percent-encode them: each byte is written as % followed by its two-digit hexadecimal value (%xx). The UTF-8 byte representation of "이윤희" is EC 9D B4 EC 9C A4 ED 9D AC, which is exactly what you're seeing in the URL.
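To make the byte-by-byte encoding concrete, here is a minimal Python sketch. Python is only an illustration here, not what Chrome runs internally, and `percent_encode` is a hypothetical helper name. It assumes the text is encoded as UTF-8, as described above.

```python
from urllib.parse import quote

def percent_encode(text: str) -> str:
    """Write every UTF-8 byte of `text` as %XX (hypothetical helper).

    Unlike a real URL encoder, this also encodes plain ASCII characters;
    it is only meant to show the byte-to-%XX mapping.
    """
    return "".join(f"%{b:02X}" for b in text.encode("utf-8"))

name = "이윤희"

# The raw UTF-8 bytes, shown as hex pairs.
utf8_hex = " ".join(f"{b:02X}" for b in name.encode("utf-8"))
print(utf8_hex)              # EC 9D B4 EC 9C A4 ED 9D AC

# The same bytes, each prefixed with '%'.
print(percent_encode(name))  # %EC%9D%B4%EC%9C%A4%ED%9D%AC

# The standard library produces the same result for this input.
print(quote(name))           # %EC%9D%B4%EC%9C%A4%ED%9D%AC
```

The last line matches the `q=` and `pq=` values in the copied URL.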

The URL is always in this form; Chrome isn't encoding it when you copy. On the contrary, when the address bar shows www.bing.com/search?q=이윤희&..., that's Chrome being nice and displaying the decoded URL for you.
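The display step can be imitated in the other direction. The following is a rough Python sketch of decoding a percent-encoded URL for display, assuming the bytes are UTF-8. The URL is a shortened version of the one in the question, and this is not a description of how Chrome implements it.

```python
from urllib.parse import unquote

# Shortened form of the encoded URL from the question.
encoded = "http://www.bing.com/search?q=%EC%9D%B4%EC%9C%A4%ED%9D%AC&go=&qs=n&form=QBLH"

# unquote() turns each %XX back into a byte and decodes the result as UTF-8.
displayed = unquote(encoded)
print(displayed)  # http://www.bing.com/search?q=이윤희&go=&qs=n&form=QBLH
```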

See What every web developer must know about URL encoding.

In PHP this can be replicated with rawurlencode:

echo rawurlencode('이윤희'); // (assuming UTF-8 encoded source code)
deceze answered Sep 21 '22 18:09