I am trying to submit a form that contains UTF-8 characters. The form looks like this:
<form id="workflowPersistForm" accept-charset="UTF-8" method="post" action="/workflow-next">
<input id="stateGlobal" type="hidden" value=" お問い合わせ" name="state">
</form>
My server is Spring-based. My web.xml already has the encoding filter:
<filter>
<filter-name>EncodingFilter</filter-name>
<filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
<init-param>
<param-name>encoding</param-name>
<param-value>UTF-8</param-value>
</init-param>
<init-param>
<param-name>forceEncoding</param-name>
<param-value>true</param-value>
</init-param>
</filter>
The problem is that the UTF-8 characters are getting garbled somewhere. I put a breakpoint at the very start of the controller, and the characters are already garbled at that point. Also, if I generate UTF-8 characters inside the controller, they are rendered correctly in the browser. It is only on form POST that the controller doesn't receive the characters properly.
Any idea what I might be doing wrong?
Edit: It looks like the data in the new page is not randomly corrupted; it is double-encoded. I am unable to understand why it is double-encoded.
Edit 2: When I change the form to GET instead of POST, everything works perfectly. I have no idea what POST is breaking.
It looks like browsers don't send the charset as part of Content-Type in the request headers (even when accept-charset is set on the form), and Tomcat treats the body of such requests as Latin-1 (ISO-8859-1) ( http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q1 ).
So at a later point the UTF-8 bytes were probably decoded as Latin-1 and then re-encoded as UTF-8, resulting in garbled characters.
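The decode-as-Latin-1, re-encode-as-UTF-8 sequence described above can be reproduced with plain Java strings. This is only a minimal sketch of the effect, not what Tomcat does internally; the class and method names (MojibakeDemo, misdecode, repair) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // The Japanese text from the form, written as Unicode escapes so the
    // encoding of this source file cannot interfere with the demo.
    static final String ORIGINAL = "\u304a\u554f\u3044\u5408\u308f\u305b";

    // Simulates a container that receives UTF-8 bytes but decodes them as ISO-8859-1.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: turn each char back into its single byte, then decode as UTF-8.
    static String repair(String garbled) {
        byte[] rawBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String garbled = misdecode(ORIGINAL);
        System.out.println("chars in original: " + ORIGINAL.length());
        System.out.println("chars after wrong decode: " + garbled.length());
        // Writing the garbled string out as UTF-8 is the "double encoding" step.
        System.out.println("bytes when re-encoded as UTF-8: "
                + garbled.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("repair restores original: " + repair(garbled).equals(ORIGINAL));
    }
}
```

Each of the six characters takes three bytes in UTF-8, so the wrongly decoded string has 18 characters, and every one of them becomes two bytes when the page is written back out as UTF-8. That growth is why the result looks double-encoded rather than randomly corrupted.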
Moving CharacterEncodingFilter to the top of the filter chain in web.xml, so that it runs before anything else reads the request parameters, and forcing the encoding to UTF-8 solved the problem.
Do you have a filter-mapping entry in your web.xml for EncodingFilter?
<filter-mapping>
<filter-name>EncodingFilter</filter-name>
<url-pattern>/*</url-pattern>
</filter-mapping>
I would suggest you remove the CharacterEncodingFilter, which may itself be the cause of double encoding.
To debug the situation, you should first check whether the browser is posting the data correctly. Use Firebug (for Firefox) or the developer tools in Chrome (F12).
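When inspecting the request in the browser tools, it helps to know what a correctly UTF-8-encoded body for this form should look like. The following is a rough sketch using the JDK's URLEncoder and URLDecoder (the Charset overloads need Java 10 or later); FormBodyDemo and its methods are hypothetical helper names, and a real browser may differ in minor details of form encoding.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class FormBodyDemo {
    // Value of the hidden "state" field, leading space included, as Unicode escapes.
    static final String STATE = " \u304a\u554f\u3044\u5408\u308f\u305b";

    // Builds the name=value pair the way application/x-www-form-urlencoded expects it.
    static String encodeBody(String name, String value) {
        return URLEncoder.encode(name, StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Decodes the value part of a single name=value pair with the given charset.
    static String decodeValue(String body, Charset charset) {
        String rawValue = body.substring(body.indexOf('=') + 1);
        return URLDecoder.decode(rawValue, charset);
    }

    public static void main(String[] args) {
        String body = encodeBody("state", STATE);
        // Compare this with the raw request body shown in the browser's network tab.
        System.out.println(body);
        System.out.println("decodes back as UTF-8: "
                + decodeValue(body, StandardCharsets.UTF_8).equals(STATE));
    }
}
```

If the body shown in the network tab matches this percent-encoded form, the browser is sending valid UTF-8 and the corruption is happening on the server side.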
Most likely, the problem is on the server side. Which server do you use? If you use Tomcat, you need to set URIEncoding to UTF-8 on the Connector element in server.xml (note that this attribute controls how the URL and query string are decoded, not the POST body).
Update 1:
It looks very likely that the problem is the forceEncoding parameter that you are setting. As per the docs:
This filter can either apply its encoding if the request does not already specify an encoding, or enforce this filter's encoding in any case ("forceEncoding"="true")
When you do a GET, there is no encoding specified, so it makes sense that it works.
However, when you do the POST, the encoding is already applied and then (it seems) is applied again because of forceEncoding=true.