 

encodeURIComponent() difference with browsers and [ä ö å] characters

Tags:

javascript

I have a problem with encodeURIComponent() as it seems to behave differently than browsers (tested with Chrome and Firefox):

  • encodeURIComponent('ä') returns %C3%A4
  • escape('ä') returns %E4
  • Chrome/Firefox encodes ä in x-www-form-urlencoded forms as %E4

So, why does encodeURIComponent behave differently than all the others (mainly browsers)? This actually causes problems as some websites don't understand what I'm trying to feed them. The website in question is http://verkkopalvelu.vrk.fi/Nimipalvelu/default.asp?L=1 (click "Etunimihaku" as it is iframe based).

Is encodeURIComponent broken and how should this situation be corrected? What would be the correct way to encode characters like ä ö å? escape() seems to encode the same as those browsers, but escape() is deprecated.

I tested the browsers with Fiddler, and the Console/Network tab also shows the encoding as %E4 when I submit a form. A test link is here: http://jsfiddle.net/tcyfktvg/1/

ile asked Oct 19 '22 21:10


1 Answer

encodeURIComponent() is not broken. It always percent-encodes the UTF-8 bytes of each character (ECMAScript 3rd Edition (ECMA-262), page 82).

escape() encodes the Unicode code value of each character (ECMAScript 1st Edition (ECMA-262), page 60). If the code is < 256, the two-digit form %XX is used, as you see for "ä". If the code is >= 256, the extended four-digit form with a leading "u" is used. Example: escape("겧") == "%uACA7".
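
The two formats can be compared side by side. This is a minimal sketch, assuming a JavaScript engine that still ships the legacy escape() global (current browsers and Node.js do); the variable names are only for illustration:

```javascript
// UTF-8 percent-encoding: "ä" (U+00E4) becomes the two bytes C3 A4.
const utf8Encoded = encodeURIComponent('ä'); // "%C3%A4"

// Legacy escape(): code values below 256 become %XX ...
const shortForm = escape('ä'); // "%E4"

// ... and code values of 256 or above become %uXXXX.
const longForm = escape('\uACA7'); // "%uACA7"

console.log(utf8Encoded, shortForm, longForm);
```

Note that the %uXXXX form is not part of standard percent-encoding, so many servers will not decode it.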

The problem arises when an HTTP server receives an encoded URL. It has to decode it, but the URL itself doesn't say which encoding was used to create it.

This URL: http://server/%C3%A4 could be http://server/ä if it was encoded by encodeURIComponent() (using UTF-8), but it could also be http://server/Ã¤ encoded by escape() (using Unicode code values):

encodeURIComponent("ä") == "%C3%A4"
escape("Ã¤") == "%C3%A4"
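
The same ambiguity shows up from the decoding side. A small sketch, assuming the legacy unescape() global is available (it is deprecated alongside escape()); the variable names are illustrative:

```javascript
const encoded = '%C3%A4';

// Interpreting the bytes C3 A4 as UTF-8 yields one character: "ä" (U+00E4).
const asUtf8 = decodeURIComponent(encoded);

// Interpreting each %XX as its own code value yields two characters:
// "Ã" (U+00C3) followed by "¤" (U+00A4).
const asCodeValues = unescape(encoded);

console.log(asUtf8, asUtf8.length);             // ä 1
console.log(asCodeValues, asCodeValues.length); // Ã¤ 2
```

Both results are valid decodings of the same string, which is why the server has to be told (or configured) which one to use.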

Which encoding the server uses to decode the URL depends on its configuration. So here's the solution to your problem: find out which URL encoding the HTTP server expects and choose the matching encoding method.
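
If the server turns out to expect single-byte %XX values such as %E4, one way to avoid the deprecated escape() is a small hand-written encoder. The sketch below is only an illustration: encodeLatin1Component is a hypothetical helper name, and it assumes the server decodes values as ISO-8859-1, which should be verified for the site in question.

```javascript
// Hypothetical helper: percent-encode a string as ISO-8859-1 bytes.
// Assumption: the target server decodes %XX sequences as ISO-8859-1.
function encodeLatin1Component(str) {
  let out = '';
  for (const ch of str) {
    const code = ch.codePointAt(0);
    if (/^[A-Za-z0-9\-_.~]$/.test(ch)) {
      // Unreserved URL characters pass through unchanged.
      out += ch;
    } else if (code < 256) {
      // ISO-8859-1 maps U+0000..U+00FF directly to one byte each.
      out += '%' + code.toString(16).toUpperCase().padStart(2, '0');
    } else {
      // Unlike escape(), refuse characters the encoding cannot represent.
      throw new RangeError('Not representable in ISO-8859-1: ' + ch);
    }
  }
  return out;
}

console.log(encodeLatin1Component('äöå')); // %E4%F6%E5
```

Throwing on characters outside the single-byte range is a deliberate choice here: it avoids silently emitting the non-standard %uXXXX form that escape() would produce.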

Eduard Wirch answered Oct 30 '22 23:10