
Encoding for project set to UTF-8, default charset returns windows-1252

I've run into an issue with encoding. I'm not sure if it's IDE related, but I'm using NetBeans 7.4. I have this piece of code in my J2EE project:

    String test = "kukuřičné";
    System.out.println(new String(test.getBytes("UTF-8"))); // should display ok
    System.out.println(new String(test.getBytes("ISO-8859-1")));
    System.out.println(new String(test.getBytes("UTF-16")));
    System.out.println(new String(test.getBytes("US-ASCII")));
    System.out.println(new String(test.getBytes("windows-1250")));
    System.out.println(test); // should display ok

And when I run it, it never displays properly. UTF-8 should be able to print that out ok but it doesn't. Also when I tried:

    System.out.println(Charset.defaultCharset());

it returned windows-1252. The project is set to UTF-8 encoding. I've even tried resaving this specific java file in UTF-8 but it still doesn't display properly.

On the other hand, I created a J2SE project, and when I run the same code there it displays properly. The default charset there also returns UTF-8.

Both projects are set to UTF-8 encoding.

I want my J2EE project to behave the same as the J2SE one. I didn't notice this issue until I updated Java to version 1.7.0_51-b13, but again I'm not sure if that is related.

I'm experiencing the same issue as this guy: http://forums.netbeans.org/ptopic37752.html

I've also tried setting the default encoding for the whole IDE: -J-Dfile.encoding=UTF-8 but it didn't help.

I've noticed an important fact. When I create a new web application, it displays OK. When I create a new Maven web application, it displays incorrectly.

Found the same issue here: https://netbeans.org/bugzilla/show_bug.cgi?id=224526

I still haven't fixed it yet. There's still no solution working.

In my pom.xml the encoding is set properly, but it still shows windows-1252 in the end.

    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>

Lenymm asked Mar 17 '14 19:03


2 Answers

I've spent a few hours trying to find the best solution.

First of all, this is a Maven issue: it picks up the platform encoding and uses it even though you've specified a different encoding. Maven doesn't seem to care (it even prints to the console that it's using UTF-8, but when you run a file with the code above, the output still doesn't display properly).

I've managed to tackle this issue by setting a system environment variable:

    JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
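
To check whether the variable is actually picked up, a small diagnostic like the one below may help. It is only a sketch: the class name ShowDefaultCharset and the describe() helper are made up for illustration. When JAVA_TOOL_OPTIONS is set, the JVM also prints a "Picked up JAVA_TOOL_OPTIONS: ..." line to stderr on startup.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of the original project.
public class ShowDefaultCharset {

    // Reports the file.encoding system property and the charset the JVM
    // uses by default for new String(byte[]) and String.getBytes().
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once from the Maven web project and once from the plain project should show whether the two really start with different defaults.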

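Instead of setting a system variable, it should also be possible to pass the setting on the command line. Note that javac does not accept a bare -D flag: the source file encoding is passed to the compiler with -encoding, while the default charset at run time is set on the java launcher (Test is a placeholder class name here):

    javac -encoding UTF-8 Test.java
    java -Dfile.encoding=UTF-8 Test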
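
If neither the environment variable nor a launcher flag is an option, one further workaround (not from the original answer, so treat it as an assumption) is to stop relying on the default charset for output and wrap the stream in a PrintStream with an explicit encoding. The class Utf8Output and its helpers are hypothetical names; the string literal uses Unicode escapes so it does not depend on the source file encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class, not part of the original project.
public class Utf8Output {

    // Wraps any stream so that text is always written as UTF-8,
    // regardless of Charset.defaultCharset().
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
    }

    // Captures what the UTF-8 stream writes, to inspect the raw bytes.
    static byte[] printToBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        PrintStream out = utf8Stream(buf);
        out.print(s);
        out.flush();
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        out.println("kuku\u0159i\u010Dn\u00E9");
    }
}
```

This only controls the bytes the program writes; the console or IDE output window still has to interpret them as UTF-8 for the text to display correctly.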

Lenymm answered Sep 22 '22 00:09


You are mixing a few concepts here:

  • the project encoding is the encoding used to save the Java source files (xxxx.java) - it has nothing to do with how your code executes
  • test.getBytes("UTF-8") returns a series of bytes representing your String in UTF-8 encoding
  • to recreate the original string, you need to explicitly give the encoding, unless it is the default of your machine: new String(test.getBytes("UTF-8"), StandardCharsets.UTF_8)
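
The points above can be sketched as follows, using the word from the question ("kukuřičné") written with Unicode escapes so the example does not depend on how the source file is saved. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating explicit encode/decode charsets.
public class EncodingRoundTrip {

    // Same word as in the question, spelled with Unicode escapes.
    static final String TEXT = "kuku\u0159i\u010Dn\u00E9";

    // Encode and decode with the same, explicitly named charset.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    // Encode as UTF-8 but decode with a different charset, which is what
    // new String(bytes) does when the default charset is not UTF-8.
    static String decodeUtf8As(String s, Charset other) {
        return new String(s.getBytes(StandardCharsets.UTF_8), other);
    }

    public static void main(String[] args) {
        // UTF-8 and UTF-16 can represent every character, so these survive.
        System.out.println(TEXT.equals(roundTrip(TEXT, StandardCharsets.UTF_8)));   // true
        System.out.println(TEXT.equals(roundTrip(TEXT, StandardCharsets.UTF_16)));  // true
        // ISO-8859-1 has no r-caron or c-caron; they are replaced with '?'.
        System.out.println(TEXT.equals(roundTrip(TEXT, StandardCharsets.ISO_8859_1))); // false
        // Mismatched decode: each UTF-8 byte becomes its own character.
        System.out.println(TEXT.equals(decodeUtf8As(TEXT, StandardCharsets.ISO_8859_1))); // false
    }
}
```

The last case is the one that matters for the question: with a windows-1252 default, new String(test.getBytes("UTF-8")) decodes UTF-8 bytes with the wrong charset in much the same way.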

assylias answered Sep 21 '22 00:09