Tomcat does not correctly encode String literals that contain Unicode characters. The problem occurs on a Linux server but not on my development machine (Windows). It affects ONLY String literals (not Strings read from a DB or from a file!).
URIEncoding="utf-8"
on the Connector tag (server.xml). None of the above works. Any ideas on what I might be missing?
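import java.io.IOException;
import java.io.Writer;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class Test extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        resp.setCharacterEncoding("utf-8");
        resp.setContentType("text/plain");
        Writer w = resp.getWriter();
        w.write("Ελληνικά Latin"); // Some Unicode characters
        w.close();
    }
}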
The above shows this in the browser: Îλληνικά Latin
UTF-8 is a variable-width character encoding that uses one to four 8-bit bytes to encode every valid Unicode code point. A code point can represent a single character, but it can also have other meanings, such as formatting.
UTF-8 is a multibyte encoding that can represent any Unicode character. ISO 8859-1 is a single-byte encoding that can represent the first 256 Unicode characters. Both encode ASCII exactly the same way.
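To make the difference concrete, here is a minimal sketch that counts the bytes each encoding produces. The class and method names (EncodingWidths, utf8Length, latin1Length) are made up for illustration, and the Greek text is written with Unicode escapes so the example does not depend on how the source file itself is saved.

```java
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as ISO 8859-1. Characters outside the
    // first 256 code points cannot be represented; getBytes substitutes
    // a single '?' byte for each of them.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        // Eight Greek letters, written as escapes (hypothetical sample text).
        String greek = "\u0395\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac";
        System.out.println(utf8Length(greek));     // 16: two bytes per Greek letter
        System.out.println(latin1Length(greek));   // 8: one '?' byte per letter
        System.out.println(utf8Length("Latin"));   // 5
        System.out.println(latin1Length("Latin")); // 5: ASCII is identical in both
    }
}
```

The ASCII part of the string comes out the same either way, which is why only the Greek part of the output looks wrong.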
A Java String always exposes its contents as UTF-16 code units, but it is more useful to think about it like this: an encoding is a way to translate between Strings and bytes.
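A small sketch of that idea, assuming nothing beyond the standard library: a String is turned into bytes with one charset and back into a String with another. The RoundTrip class and its roundTrip helper are hypothetical names used only for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTrip {
    // Encodes the String to bytes with one charset, then decodes those
    // bytes back into a String with a (possibly different) charset.
    static String roundTrip(String s, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = s.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String text = "\u0395\u03bb Latin"; // two Greek letters plus ASCII

        // Matching charsets: the original String comes back unchanged.
        String same = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(same.equals(text)); // true

        // Mismatched charsets: each UTF-8 byte is decoded as its own character.
        String mixed = roundTrip(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(mixed.equals(text)); // false
        System.out.println(mixed.length());     // 10: four bytes for the Greek letters, six for " Latin"
    }
}
```

The mismatched case is the kind of corruption that appears when bytes written in one encoding are read back in another.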
You can force the encoding javac uses when it reads source files by passing -encoding utf-8 or -encoding iso-8859-1 when compiling. Just make sure that it matches the encoding your .java files are actually saved in.
http://docs.oracle.com/javase/6/docs/technotes/tools/windows/javac.html
-encoding encoding Set the source file encoding name, such as EUC-JP and UTF-8. If -encoding is not specified, the platform default converter is used.
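One way to check whether the compiler read a literal correctly is to dump its code points at runtime. The sketch below is only an illustration: LiteralCheck and codePoints are made-up names, and the sample strings use Unicode escapes so the example compiles the same way under any source encoding. In the real servlet you would pass the actual literal instead.

```java
public class LiteralCheck {
    // Renders each code point of the string in U+XXXX form, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // What a correctly compiled two-letter Greek literal contains.
        System.out.println(codePoints("\u0395\u03bb")); // U+0395 U+03BB

        // What the same literal contains if its UTF-8 source bytes were
        // decoded as ISO 8859-1 at compile time: one character per byte.
        System.out.println(codePoints("\u00CE\u0095\u00CE\u00BB")); // U+00CE U+0095 U+00CE U+00BB
    }
}
```

If the dump of the real literal shows values such as U+00CE where Greek code points are expected, the damage happened at compile time, and the -encoding option is the place to look.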
Try setting the file.encoding system property, e.g. -Dfile.encoding=utf-8, on the Linux JVM command line.
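To see what the JVM on each machine actually uses, a short check like the following can be run on both the Windows and the Linux box. This is a sketch; DefaultCharsetCheck and describe are hypothetical names. Note that on recent JDKs (18 and later) the default charset is UTF-8 unless it is overridden, so the two machines may only differ on older Java versions.

```java
import java.nio.charset.Charset;

public class DefaultCharsetCheck {
    // Reports the charset the JVM uses by default and the raw system property.
    static String describe() {
        return "default charset = " + Charset.defaultCharset().name()
                + ", file.encoding = " + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        // Run once as-is and once with -Dfile.encoding=utf-8 to compare.
        System.out.println(describe());
    }
}
```

If the two machines report different defaults, that would explain why the same source behaves differently on each of them.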