I have a servlet which receive some parameter from the client ,then do some job. And the parameter from the client is Chinese,so I often got some invalid characters in the servet. For exmaple: If I enter
http://localhost:8080/Servlet?q=中文&type=test
Then in the servlet,the parameter of 'type' is correct(test),however the parameter of 'q' is not correctly encoding,they become invalid characters that can not parsed.
However if I enter the adderss bar again,the url will changed to :
http://localhost:8080/Servlet?q=%D6%D0%CE%C4&type=test
Now my servlet will get the right parameter of 'q'.
UPDATE
BTW,it words well when I send the form with post. WHen I send them in the ajax,for example:
url="http://..q='中文',
xmlhttp.open("POST",url,true);
Then the server side also get the invalid characters.
It seems that just when the Chinese character are encoded like %xx,the server side can get the right result.
That's to say http://.../q=中文 does not work,
http://.../q=%D6%D0%CE%C4 work.
But why "http://www.google.com.hk/search?hl=zh-CN&newwindow=1&safe=strict&q=%E4%B8%AD%E6%96%87&btnG=Google+%E6%90%9C%E7%B4%A2&aq=f&aqi=&aql=&oq=&gs_rfai=" work?

Ensure that the encoding of the page with the form itself is also UTF-8 and ensure that the browser is instructed to read the page as UTF-8. Assuming that it's JSP, just put this in very top of the page to achieve that:
<%@ page pageEncoding="UTF-8" %>
Then, to process GET query string as UTF-8, ensure that the servletcontainer in question is configured to do so. It's unclear which one you're using, so here's a Tomcat example: set the URIEncoding attribute of the <Connector> element in /conf/server.xml to UTF-8.
<Connector URIEncoding="UTF-8">
For the case that you'd like to use POST, then you need to ensure that the HttpServletRequest is instructed to parse the POST request body using UTF-8.
request.setCharacterEncoding("UTF-8");
Call this before you access the first parameter. A Filter is the best place for this.
Using non-ASCII characters as GET parameters (i.e. in URLs) is generally problematic. RFC 3986 recommends using UTF-8 and then percent encoding, but that's AFAIK not an official standard. And what you are using in the case where it works isn't UTF-8!
It would probably be safest to switch to POST requests.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With