
What's the difference between Tomcat's "URIEncoding", an encoding filter, and request.setCharacterEncoding?

There seem to be several ways to solve encoding problems:

  • An encoding filter, such as Spring MVC's CharacterEncodingFilter configured for UTF-8

  • Setting URIEncoding="UTF-8" in Tomcat's server.xml, as described at http://struts.apache.org/release/2.1.x/docs/how-to-support-utf-8-uriencoding-with-tomcat.html

  • request.setCharacterEncoding("UTF-8")

Today I ran into a problem where a path parameter is not decoded correctly:

@ResponseBody
@RequestMapping(value="/context/method/{key}",method=RequestMethod.GET,produces = "application/json;charset=utf-8")
public String method(@PathVariable String key){

    logger.info("key="+key+"------------");
    return key; // placeholder return so the method compiles; the real body is omitted
}

I can see that the key is decoded incorrectly. If I pass the word "新浪" from the front end, it becomes "æ°æµª". I wrote the code below to check whether the server is decoding it as ISO-8859-1:

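import java.io.UnsupportedEncodingException;

public class EncodingCheck {
    public static void main(String args[]) throws UnsupportedEncodingException{
        String key="新浪";
        byte[] bytes=key.getBytes("UTF-8");            // encode the way the browser does
        String decode=new String(bytes,"ISO-8859-1");  // decode the way the server seems to
        System.out.println(decode);
    }
}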

It prints the same output, "æ°æµª", so the path variable is indeed being decoded as ISO-8859-1.
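As a further sanity check on that conclusion, the garbling can be reversed: ISO-8859-1 maps every byte to exactly one character, so re-encoding the garbled string as ISO-8859-1 and decoding it as UTF-8 should give back the original word. The sketch below is only an illustration; the class and method names are hypothetical, and the test word is written with Unicode escapes so the result does not depend on the source-file encoding.

```java
import java.nio.charset.StandardCharsets;

class MojibakeRoundTrip {
    // Simulates the wrong decoding: UTF-8 bytes interpreted as ISO-8859-1.
    static String garble(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses it: the ISO-8859-1 bytes of the garbled text are the original UTF-8 bytes.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "\u65b0\u6d6a"; // the two characters used in the question
        String garbled = garble(word);
        System.out.println(garbled.length());            // one char per UTF-8 byte
        System.out.println(repair(garbled).equals(word)); // round trip restores the word
    }
}
```

If the round trip restores the word, the data was not lost, only decoded with the wrong charset, which points at server configuration rather than at the client.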

I then tried adding a filter to my web.xml to solve the problem:

  <filter>
    <filter-name>encodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
  </filter>

  <filter-mapping>
    <filter-name>encodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>

But the output was still garbled.

It only worked once I added the following to the Connector in my server.xml:

<!-- URIEncoding and useBodyEncodingForURI are the attributes I added -->
<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1" redirectPort="8443"
           URIEncoding="UTF-8" useBodyEncodingForURI="true"
/>

With that change it works, even after I remove the filter.

But I am still very confused about encoding. Besides, this only covers the GET method; for POST, I guess the solution would be different.

Can anybody explain which encoding solution should be used for which kind of problem?

Thank you!

JaskeyLam asked Nov 15 '14 10:11



1 Answer

  • CharacterEncodingFilter configures the encoding of the request body. That is, it affects the encoding of POST request parameters and the like, but doesn't affect the encoding of GET parameters.

  • URIEncoding specifies the encoding of the URI, so it affects GET parameters (and path segments such as your path variable).

  • useBodyEncodingForURI="true" tells Tomcat to use the encoding configured for the request body when decoding URI query parameters. So, as far as I understand, if you set CharacterEncodingFilter and useBodyEncodingForURI="true", then you don't need URIEncoding for query parameters.

In practice, you need two things to solve possible problems with the encoding of parameters:

  • CharacterEncodingFilter for POST requests

  • URIEncoding (or useBodyEncodingForURI="true") for GET requests
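To make the role of the URI charset concrete, here is a minimal sketch using the JDK's java.net.URLDecoder rather than Tomcat's own decoder, so it only approximates what the connector does (URLDecoder follows form-encoding rules and also turns "+" into a space). The class and method names are hypothetical. It decodes the same percent-encoded bytes once as UTF-8 and once as ISO-8859-1.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class UriDecodingSketch {
    // Percent-decodes the input, interpreting the resulting bytes in the given charset.
    static String decode(String percentEncoded, String charset) {
        try {
            return URLDecoder.decode(percentEncoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 bytes of the two characters from the question, as a browser would send them.
        String onTheWire = "%E6%96%B0%E6%B5%AA";

        String asUtf8 = decode(onTheWire, "UTF-8");
        String asLatin1 = decode(onTheWire, "ISO-8859-1");

        System.out.println(asUtf8.equals("\u65b0\u6d6a")); // decoded as intended
        System.out.println(asLatin1.length());              // one garbled char per byte
    }
}
```

The bytes on the wire are identical in both cases; only the charset chosen at decode time differs, which is the setting URIEncoding controls for the URI.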

axtavt answered Oct 23 '22 17:10