 

Why is no encoding set in response by Tomcat? How can I deal with it?

I recently had a problem with the encoding of pages generated by servlets: it occurred when the servlets were deployed under Tomcat, but not under Jetty. I did a bit of research and reduced the problem to the following servlet:

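import java.io.IOException;
import java.io.Writer;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class TestServlet extends HttpServlet {
    @Override
    public void service(HttpServletRequest request, HttpServletResponse response) throws IOException {
        // Content type without a charset parameter; this is what triggers the problem.
        response.setContentType("text/plain");
        Writer output = response.getWriter();
        output.write("öäüÖÄÜß");
        output.flush();
        output.close();
    }
}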

If I deploy this under Jetty and point the browser at it, I get the expected result. The data is returned as ISO-8859-1, and the response headers from Jetty include:

Content-Type: text/plain; charset=iso-8859-1

The browser detects the encoding from this header. If I deploy the same servlet in Tomcat, the browser shows strange characters. Tomcat also returns the data as ISO-8859-1; the difference is that no header says so. The browser therefore has to guess the encoding, and it guesses wrong.
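To make the failure concrete, here is a minimal JDK-only sketch of what happens when ISO-8859-1 bytes are decoded with the wrong charset. It assumes the client guesses UTF-8, which is only one possible guess; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

final class CharsetGuessDemo {
    // Encodes the text as ISO-8859-1 (what the server sends) and decodes it
    // as UTF-8 (what a client might wrongly guess without a charset header).
    static String misreadAsUtf8(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    // Encodes and decodes with the same charset, as a client would do
    // if the header told it the encoding.
    static String readAsLatin1(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The seven characters from the servlet, written as Unicode escapes
        // so that the source file encoding does not matter.
        String original = "\u00f6\u00e4\u00fc\u00d6\u00c4\u00dc\u00df";
        System.out.println("correct charset round-trips: "
                + readAsLatin1(original).equals(original));
        System.out.println("wrong guess round-trips:     "
                + misreadAsUtf8(original).equals(original));
    }
}
```

The bytes on the wire are identical in both cases; only the charset used for decoding differs, which is why the missing header matters.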

My question is: is this behaviour of Tomcat correct, or is it a bug? And if it is correct, how can I avoid the problem? Sure, I can always add response.setCharacterEncoding("UTF-8"); to the servlet, but that means I set a fixed encoding that the browser might or might not understand. The problem is even more relevant if the servlet is accessed by another service rather than a browser. So how should I deal with this in the most flexible way?

Dishayloo asked Mar 24 '10 16:03




1 Answer

If you don't specify an encoding, the Servlet specification requires ISO-8859-1. However, AFAIK it does not require the container to put the encoding into the Content-Type header, at least not if you set the content type to just "text/plain". This is what the spec says:

Calls to setContentType set the character encoding only if the given content type string provides a value for the charset attribute.

In other words, only if you set the content type like this

response.setContentType("text/plain; charset=XXXX")

is Tomcat required to set the charset. I haven't tried whether this works, though.
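The rule quoted from the spec can be pictured with a small sketch that looks for a charset parameter in a content type string. This is a simplified illustration, not Tomcat's actual parsing code; ContentTypeCharset and charsetOf are hypothetical names.

```java
import java.util.Locale;
import java.util.Optional;

final class ContentTypeCharset {
    // Returns the value of the charset parameter, if the content type has one.
    static Optional<String> charsetOf(String contentType) {
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the rest are parameters such as "charset=UTF-8".
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? Optional.empty() : Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

With this sketch, "text/plain" yields no charset (so, per the quoted rule, setContentType would leave the character encoding alone), while "text/plain; charset=UTF-8" yields "UTF-8".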

In general, I would recommend always setting the encoding to UTF-8 (it causes the least trouble, at least in browsers) and, for text/plain, stating the encoding explicitly so that browsers don't fall back to a system default.
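A minimal sketch of that recommendation, using only the JDK so it can be run outside a container: build the Content-Type value with an explicit charset and encode the body with the same charset. ExplicitCharset and its methods are hypothetical helper names, not part of the Servlet API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ExplicitCharset {
    // Builds a Content-Type value that names the charset explicitly.
    static String contentType(String mediaType, Charset charset) {
        return mediaType + "; charset=" + charset.name();
    }

    // Encodes the body with the same charset that the header announces.
    static byte[] body(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String header = contentType("text/plain", StandardCharsets.UTF_8);
        byte[] bytes = body("\u00f6\u00e4\u00fc\u00d6\u00c4\u00dc\u00df", StandardCharsets.UTF_8);
        System.out.println("Content-Type: " + header);
        System.out.println("Body length in bytes: " + bytes.length);
    }
}
```

In a servlet, the resulting string would be passed to response.setContentType(...) before response.getWriter() is called, because the charset can no longer be changed once the writer has been obtained.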

Tim Jansen answered Sep 19 '22 03:09