
URL-encoded character gets parsed wrongly by Webflow/EL/JSF

When I submit the character Ö from a webpage, the backend receives Ã. The webpage is part of a Spring Webflow/JSF 1.2/Facelets application. When I inspect the POST with Firebug I see:

Content-Type: application/x-www-form-urlencoded 
Content-Length: 74 
rapport=krediet_aanvragen&fw1=0&fw2=%C3%96ZTEKIN&fw3=0&fw4=0&zoeken=Zoeken

The character Ö is encoded as %C3%96. Using this table I can see that this is the correct hexadecimal representation of the UTF-8 encoding of Ö. However, when the value reaches the backend the character has changed into Ã. Using the same table I can see that some code somewhere interprets the bytes C3 and 96 separately (or as Unicode \u notation): U+00C3 happens to be Ã, and U+0096 is not a visible character, which explains the result.
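The mismatch can be reproduced outside the container. The following is a minimal sketch, not the actual Webflow or jboss-el code path: it assumes the server side falls back to ISO-8859-1 when decoding the form body (the servlet default when no request encoding has been set). The class name EncodingMismatchDemo and the helper decode are hypothetical names used only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class EncodingMismatchDemo {

    // Decodes a percent-encoded form value using the given charset name.
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The fw2 value exactly as it appears in the POST body above.
        String raw = "%C3%96ZTEKIN";

        // Decoding with the charset the browser used to encode the value.
        String asUtf8 = decode(raw, "UTF-8");
        // Decoding with the assumed server-side fallback charset.
        String asLatin1 = decode(raw, "ISO-8859-1");

        // Print code points in hex so the result does not depend on the console encoding.
        System.out.println("UTF-8 first char:      U+00" + Integer.toHexString(asUtf8.charAt(0)).toUpperCase());
        System.out.println("ISO-8859-1 first char: U+00" + Integer.toHexString(asLatin1.charAt(0)).toUpperCase());
        System.out.println("ISO-8859-1 second char: U+00" + Integer.toHexString(asLatin1.charAt(1)).toUpperCase());
        System.out.println("Lengths: " + asUtf8.length() + " vs " + asLatin1.length());
    }
}
```

Under that assumption, the UTF-8 decode yields a single Ö (U+00D6), while the ISO-8859-1 decode turns the same two bytes into two characters, Ã (U+00C3) followed by the invisible control character U+0096, which matches the symptom described.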

Now I know this is a typical case of an encoding mismatch; I just don't know where to look to fix it.

The webpage contains

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

When debugging I can see that the library responsible for the wrong interpretation is jboss-el 2.0.0.GA, which seems right because the value is passed to the backend in a Webflow expression:

<evaluate expression="rapportCriteria.addParameter('fw2', flowScope.fw2)" />

It is put onto the flowScope by:

<evaluate expression="requestParameters.fw2" result="flowScope.fw2"/>

Never mind the convoluted way of getting the form input into the backend; this is code that tries to integrate Webflow with BIRT reports. I see the same symptom in other web applications, though.

Any idea where I have to start looking?

Nicolas Mommaerts asked Apr 07 '11 09:04


1 Answer

I can see that it is the correct hexadecimal representation of the UTF-8/Unicode character Ö. However when it reaches the backend the character is changed into Ã.

So the client-side character encoding used to encode the POST body is correct, but the server-side character encoding used to decode the POST body is not. You need to create a Filter which basically does the following in its doFilter() method

request.setCharacterEncoding("UTF-8");

and map it on the URL pattern of interest. Spring also provides one out of the box, the CharacterEncodingFilter, which does basically the above. All you need to do is add it to the web.xml:

<filter>
    <filter-name>characterEncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

<filter-mapping>
    <filter-name>characterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

See also:

  • Unicode - How to get characters right? - JSP/Servlet requests - POST

By the way, the HTML meta header is irrelevant to this issue; it is ignored when the page is served over HTTP. It is the HTTP response header that tells the web browser which charset to use to display the response and to send the parameters back to the server. This has apparently already been set properly, since the POST body is correctly encoded. The HTML meta header is only used when the user saves the page to local disk and reopens it from there.

BalusC answered Nov 15 '22 05:11