 

Why are characters corrupted when using request.getParameter() in Java? [duplicate]

I have a link like this in a JSP page encoded in Big5: http://hello/world?name=婀ㄉ. When I enter it in the browser's address bar, it is changed to something like http://hello/world?name=%23%24%23. When we then read this parameter in the JSP page, all the characters are corrupted.

We have already called request.setCharacterEncoding("UTF-8"), so we expected every request to be decoded as UTF-8.

Why doesn't it work in this case? Thanks in advance!

MemoryLeak asked Sep 02 '09 04:09



1 Answer

When you enter a URL in the browser's address bar, the browser may convert the character encoding before URL-encoding it. However, this behavior is not well defined; see my question:

Handling Character Encoding in URI on Tomcat

We mostly get UTF-8 and Latin-1 from newer browsers, but we get all kinds of encodings (including Big5) from old ones. So it's best to avoid non-ASCII characters in URLs that the user enters directly.
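To illustrate why a mismatch corrupts the parameter, here is a minimal sketch using only `java.net.URLEncoder` and `java.net.URLDecoder`. It assumes a browser that percent-encodes with Latin-1 and a server that decodes with UTF-8; the class and method names (`EncodingMismatchDemo`, `encodeAs`, `decodeAs`) are made up for this example and are not part of any servlet API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: simulates a browser and a server that disagree
// about the charset used for percent-encoding a query parameter.
final class EncodingMismatchDemo {

    // Percent-encode a value the way a client using `charset` would.
    static String encodeAs(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Percent-decode a value the way a server configured with `charset` would.
    static String decodeAs(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE

        // The "browser" sends the single Latin-1 byte 0xE9.
        String wire = encodeAs(original, "ISO-8859-1");
        System.out.println("on the wire: " + wire);

        // Decoding with the charset the client used recovers the value.
        System.out.println("Latin-1 round trip ok: "
                + decodeAs(wire, "ISO-8859-1").equals(original));

        // Decoding with a different charset does not.
        System.out.println("UTF-8 decode ok: "
                + decodeAs(wire, "UTF-8").equals(original));
    }
}
```

The same thing happens with Big5 bytes decoded as UTF-8: the bytes on the wire are valid, but the server interprets them with the wrong table.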

If the URL is embedded in a JSP, you can force it into UTF-8 by generating it like this:

String link = "http://hello/world?name=" + URLEncoder.encode(name, "UTF-8");
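As a self-contained sketch of the line above, the following wraps the same `URLEncoder.encode(name, "UTF-8")` call in a helper. The class and method names (`LinkBuilderDemo`, `buildLink`) and the sample value are assumptions for illustration only; the base URL is the one from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the link with the parameter always
// percent-encoded as UTF-8, regardless of the page's own encoding.
final class LinkBuilderDemo {

    static String buildLink(String name) {
        try {
            return "http://hello/world?name=" + URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two CJK characters, written as escapes so the source file
        // encoding does not matter.
        System.out.println(buildLink("\u4e2d\u6587"));
        // URLEncoder uses form encoding, so a space becomes '+'.
        System.out.println(buildLink("a b"));
    }
}
```

Because the link now always carries UTF-8 bytes, the server only decodes it correctly if it is also configured to read the query string as UTF-8.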

On Tomcat, the encoding used to decode the URI needs to be specified on the Connector like this:

<Connector port="8080" URIEncoding="UTF-8"/>

You also need to call request.setCharacterEncoding("UTF-8") for the body encoding, but it's not safe to do this in a servlet: the call only takes effect if the parameters have not been processed yet, and another filter or valve may already have triggered the processing. So you should do it in a filter. Tomcat comes with such a filter in the source distribution.

ZZ Coder answered Sep 24 '22 18:09
