
encoding of query string parameters in IE10

A customer wants to be able to type the URL of my web service, including its query string parameters, into the IE10 address bar and get the service results. The parameters include Hebrew strings, for example:

http://mywebsite.com/service.asmx/foo?param1=123&param2=מחרוזתבעברית

It seems to me that IE10 won't encode the query string parameters: every non-ASCII character after the ? mark is turned into a '3f' byte, although IE does encode what comes before the ? mark (the path).

For example, if I try to reach the following URL (the parameter is fictional, the URL is not, and I have no connection with the site)

http://www.shlomo.co.il/pageshe/sales/רכב-למכירה.asp?param=פאראם 

and look in Wireshark at the bytes sent to the server, it shows me

[Image: Wireshark output]

You can see that it replaces the Hebrew part of the path with a URL-encoded string, but replaces the Hebrew parameter value with ?????, which are '3f' bytes.
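For readers who want to see where '3f' bytes can come from, here is a minimal Python sketch. It is only an illustration of one plausible mechanism (converting text to a legacy codepage that has no Hebrew letters), not a reproduction of IE's actual algorithm. The choice of cp1252 and cp1255 is an assumption made for the example.

```python
# Illustration only: lossy conversion to a codepage without Hebrew letters
# produces '?' (byte 0x3f) for every character that cannot be represented.
param = "\u05e4\u05d0\u05e8\u05d0\u05dd"  # the Hebrew value of "param" in the example URL

# cp1252 (Western European) has no Hebrew letters, so each one becomes '?'.
lossy = param.encode("cp1252", errors="replace")

# cp1255 (Hebrew) can represent the letters, one byte each.
hebrew_cp = param.encode("cp1255")

# UTF-8 uses two bytes per Hebrew letter; this is what gets %-escaped in a valid URL.
utf8 = param.encode("utf-8")

print(lossy.hex())      # five 3f bytes
print(hebrew_cp.hex())
print(utf8.hex())
```

The first line of output shows the same pattern as the capture: one 3f byte per Hebrew character.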

The same string in Chrome is encoded in its entirety:

GET http://www.shlomo.co.il/pageshe/sales/%D7%A8%D7%9B%D7%91-%D7%9C%D7%9E%D7%9B%D7%99%D7%A8%D7%94.asp?param=%D7%A4%D7%90%D7%A8%D7%90%D7%9D HTTP/1.1
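As a quick check that Chrome's request is plain %-escaped UTF-8, the escaped parameter value can be decoded back with Python's standard library. This is a small sketch; the variable names are only for illustration.

```python
from urllib.parse import unquote

# The query value exactly as it appears in Chrome's request line.
chrome_query_value = "%D7%A4%D7%90%D7%A8%D7%90%D7%9D"

# unquote() decodes %-escapes as UTF-8 by default.
decoded = unquote(chrome_query_value)

# List the code points to confirm they fall in the Hebrew block (U+0590..U+05FF).
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(decoded)
print(code_points)
```

Each pair of escaped bytes decodes to one Hebrew letter, so the ten escaped bytes give back the five-letter parameter.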

I tried this on machines running Windows 7 with IE10 and Hebrew Windows XP with IE8.

My IE settings are shown below. I specifically checked the "Always show encoded addresses" option and restarted to see whether it helps, but it made no difference:

[Image: IE advanced settings]

I searched for information about the issue but didn't find much.

My questions are:

  • Is it indeed like this, or am I missing something?
  • Is this behavior documented anywhere?
  • Are there any settings in IE/Windows that enable encoding of the parameters?

P.S. Of course, if I were developing the client or web UI, I would simply URL-encode my query. But the customer's request was specifically to paste the query into the IE address bar, which is why I'm interested in this specific behavior.

Thanks.

alex440 asked Aug 13 '13 22:08




1 Answer

Yes, your observation of the behavior is accurate. Internet Explorer 10 and below follow a complicated algorithm for encoding the URL. This was allegedly updated in Internet Explorer 11, but I've found that the new option doesn't seem to work.

The "Always show encoded addresses" option concerns whether Punycode is shown for IDN hostnames, and does not affect the query string. The "Send UTF-8 URLs" option mostly applies to the encoding of the path, although it can also affect other codepaths.

The behavior isn't fully documented anywhere. I'd meant to write a full post on my IEInternals blog about it but ended up moving on from Microsoft before doing so. There's a partial explanation in this blog post.

Yes, there are settings that impact the behavior. The "Send UTF-8 URLs" checkbox inside Tools > Internet Options > Advanced is one of the variables that determines how URLs are sent, but the option does not blindly do what its name implies (it only UTF-8 encodes the path, not the query string). Other variables involved include:

  1. Where the URL was typed (e.g. the address bar vs. Start > Run)
  2. What the system's ANSI codepage is (i.e. the locale the OS uses by default)
  3. The charset of the currently loaded page in the browser

As a consequence of these variables, you cannot reliably use URLs that are not properly encoded (i.e. %-escaped UTF-8) in Internet Explorer.
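If the customer can be handed a ready-made URL to paste, one option is to %-escape it ahead of time. Below is a minimal Python sketch using only the standard library. The helper name `to_ascii_url` is hypothetical, and it assumes the hostname is already ASCII (IDN hosts are not handled).

```python
from urllib.parse import quote, urlsplit, urlunsplit

def to_ascii_url(url: str) -> str:
    """Return url with non-ASCII path and query characters %-escaped as UTF-8.

    Hypothetical helper for illustration; assumes an ASCII hostname.
    """
    parts = urlsplit(url)
    # Keep '/' in the path and '=', '&' in the query as delimiters.
    # Keeping '%' avoids double-escaping input that is already partly encoded.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))

# The example URL from the question, with the Hebrew written as escapes
# so the source stays pure ASCII.
example = (
    "http://www.shlomo.co.il/pageshe/sales/"
    "\u05e8\u05db\u05d1-\u05dc\u05de\u05db\u05d9\u05e8\u05d4.asp"
    "?param=\u05e4\u05d0\u05e8\u05d0\u05dd"
)
print(to_ascii_url(example))
```

The result contains only ASCII characters, so it does not depend on the browser's own encoding decisions when pasted into the address bar.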

EricLaw answered Sep 22 '22 12:09