 

How can I reproduce the URL encoding of a form GET request in JavaScript when accented characters are involved?

Say I have a simple form like this:

<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  </head>
  <body>
    <div id="search">
      <form method="GET" action="/super-action">
        <input type="text" name="q" />
      </form>
    </div>
  </body>
</html>

with an input like: @tags "Cinéma Whatever"

a form GET request results in a url that looks like: /super-action?q=%40tags+"Cinéma+Whatever"

Now I want to reproduce that with JavaScript in location.hash, with a pound sign instead of the question mark, like: /super-action#q=%40tags+"Cinéma+Whatever"

But with the available functions, I get these results:

  • escape(input): @tags%20%22Cin%E9ma%20Whatever%22
  • encodeURI(input): @tags%20%22Cin%C3%A9ma%20Whatever%22
  • encodeURIComponent(input): %40tags%20%22Cin%C3%A9ma%20Whatever%22
  • $(form).serialize(), without q=: %40tags+%22Cin%C3%A9ma+Whatever%22

The question: Using JavaScript, how can I encode an input value like @tags "Cinéma Whatever" the way a form GET request does, giving %40tags+"Cinéma+Whatever"?

bksunday asked Oct 08 '22 20:10


1 Answer

According to RFC 1738, /super-action?q=%40tags+"Cinéma+Whatever" is not valid inside a URL:

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

That means that you can't produce a valid URL with that substring in it. You must encode the special characters " and é, otherwise the resulting string is not a URL.

The reason you think this is valid is probably that your browser is playing tricks on you: it may display the URL in partially decoded form to make it easier to read in the address bar. Try using a protocol analyzer like Wireshark to inspect the actual URL path sent across the network.
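If a protocol analyzer is not at hand, a rough check is to pass the string through the standard URL parser, which serializes it in encoded form. This is only a sketch: the host example.com is a placeholder, and it assumes an environment with the global URL class (modern browsers, Node.js).

```javascript
// Parse the URL exactly as it appears in the address bar.
// example.com is a placeholder host, not part of the original question.
const shown = 'http://example.com/super-action?q=%40tags+"Cinéma+Whatever"';
const parsed = new URL(shown);

// The parser percent-encodes the double quotes and the UTF-8 bytes of "é";
// the existing "%40" and the "+" signs are left untouched.
const sentQuery = parsed.search;
console.log(sentQuery); // ?q=%40tags+%22Cin%C3%A9ma+Whatever%22
```

The serialized query no longer contains the raw " and é characters, which supports the idea that the readable form is only a display convenience.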

UPDATE: I quickly confirmed this. The HTTP request line sent when the form is submitted is the following:

GET /?q=%40tags+%22Cin%C3%A9ma+Whatever%22 HTTP/1.1

So the value is first UTF-8 encoded and then URL encoded, with spaces written as +.
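That two-step encoding can be reproduced in JavaScript roughly as follows. This is a minimal sketch, not a definitive implementation: formEncode is a hypothetical helper name, and the extra escaping of ! ' ( ) ~ assumes the browser follows the current form-encoding rules, since encodeURIComponent leaves those characters alone. The URLSearchParams line is an alternative for environments that provide it.

```javascript
// Hypothetical helper: encode a single field value the way a GET form submit does.
function formEncode(value) {
  return encodeURIComponent(value)        // UTF-8 encode, then percent-encode
    .replace(/%20/g, '+')                 // form encoding writes a space as "+"
    .replace(/[!'()~]/g, function (ch) {  // not escaped by encodeURIComponent
      return '%' + ch.charCodeAt(0).toString(16).toUpperCase();
    });
}

const input = '@tags "Cinéma Whatever"';

const encoded = formEncode(input);
console.log(encoded); // %40tags+%22Cin%C3%A9ma+Whatever%22

// Alternative: URLSearchParams serializes name=value pairs with the same rules.
const viaParams = new URLSearchParams({ q: input }).toString();
console.log(viaParams); // q=%40tags+%22Cin%C3%A9ma+Whatever%22
```

In the page itself, the result could then be assigned with location.hash = 'q=' + formEncode(input), which matches what $(form).serialize() already produced in the question.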

Niklas B. answered Oct 10 '22 11:10