
How to force a jar (or the JVM the jar runs in) to use UTF-8 instead of the system's default encoding


My Windows default encoding is GBK, while my Eclipse workspace is entirely UTF-8 encoded.
So an application that runs well in Eclipse crashes when exported as a jar file, because the text becomes unreadable.
I have to put the following line in a .bat file to run the application:

   start java -Dfile.encoding=utf-8 -jar xxx.jar     

Now my question is: can I write something in the source code to make the application (or the JVM it runs in) use UTF-8 instead of the system's default encoding?

Aloong asked Nov 11 '10 21:11




1 Answer

When you open a file for reading, you need to explicitly specify the encoding you want to use for reading the file:

Reader r = new InputStreamReader(new FileInputStream("myfile"), StandardCharsets.UTF_8); 

Then the value of the default platform encoding (which you can change using -Dfile.encoding) no longer matters.
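As a rough, self-contained sketch of the same idea: the class below writes a temporary file as UTF-8 and reads it back with an explicitly specified charset, so the result should not depend on -Dfile.encoding. The class name ExplicitEncodingDemo, the helper method names, and the temp-file prefix are made up for illustration and are not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names here are illustrative assumptions.
class ExplicitEncodingDemo {

    // Writes the text as UTF-8 bytes, regardless of the platform default encoding.
    static void writeUtf8(Path file, String text) throws IOException {
        try (Writer w = new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
            w.write(text);
        }
    }

    // Reads the first line, decoding the file's bytes explicitly as UTF-8.
    static String readFirstLineUtf8(Path file) throws IOException {
        try (BufferedReader r = new BufferedReader(new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.UTF_8))) {
            return r.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("encoding-demo", ".txt");
        try {
            String original = "\u4f60\u597d, world"; // two Chinese characters plus ASCII
            writeUtf8(tmp, original);
            String readBack = readFirstLineUtf8(tmp);
            System.out.println("Round trip intact: " + original.equals(readBack));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Because both the writer and the reader name their charset, running this with or without -Dfile.encoding=utf-8 should give the same round-trip result.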

Note:

I would normally recommend always specifying the encoding explicitly for any operation that depends on the platform's default settings, such as character I/O. Many Java API methods default to the platform encoding, which I consider bad design: the platform encoding is often not the right one, and it may suddenly change (e.g. if the user switches the OS locale), breaking your app.

So just always say which encoding you want.

There are some cases where the platform encoding is the right one (such as when opening a file the user just created for you), but they are fairly rare.
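To illustrate why relying on a default is risky, here is a small sketch (the class and method names are invented for this example). It encodes a string as UTF-8 and then decodes the bytes once with the matching charset and once with a mismatched one, which is roughly what happens when the platform encoding differs from the encoding the data was written in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative assumptions.
class DefaultVsExplicit {

    // Encodes and decodes with the same explicit charset.
    static String roundTrip(String text, Charset cs) {
        byte[] bytes = text.getBytes(cs);
        return new String(bytes, cs);
    }

    // Encodes as UTF-8 but decodes with another charset,
    // simulating a platform default that does not match the data.
    static String decodeUtf8BytesAs(String text, Charset other) {
        return new String(text.getBytes(StandardCharsets.UTF_8), other);
    }

    public static void main(String[] args) {
        // The value printed here depends on the machine and JVM settings.
        System.out.println("Platform default: " + Charset.defaultCharset());

        String text = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println("Matching charset keeps text: "
                + text.equals(roundTrip(text, StandardCharsets.UTF_8)));
        System.out.println("Mismatched charset keeps text: "
                + text.equals(decodeUtf8BytesAs(text, StandardCharsets.ISO_8859_1)));
    }
}
```

The mismatched decode turns the single accented character into two unrelated characters, which is the kind of unreadable text described in the question.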

Note 2:

java.nio.charset.StandardCharsets was introduced in Java 7 (1.7). For older Java versions, you need to specify the input encoding as a String (ugh). The list of available encodings depends on the JVM, but every JVM is guaranteed to support at least:

US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE, UTF-16.
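A minimal sketch of the String-based style, assuming nothing beyond the standard library: it checks the six guaranteed charset names with Charset.isSupported and decodes a byte array through an InputStreamReader constructed with a charset name. The class name CharsetByName and its helper methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical demo class; the names here are illustrative assumptions.
class CharsetByName {

    // The charset names every JVM is required to support.
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Returns true if this JVM reports all required charsets as supported.
    static boolean allRequiredSupported() {
        for (String name : REQUIRED) {
            if (!Charset.isSupported(name)) {
                return false;
            }
        }
        return true;
    }

    // Pre-Java-7 style: the encoding is passed as a String name.
    // An unknown name causes an UnsupportedEncodingException (an IOException).
    static String decode(byte[] data, String charsetName) throws IOException {
        Reader r = new InputStreamReader(new ByteArrayInputStream(data), charsetName);
        try {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } finally {
            r.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("All required charsets supported: " + allRequiredSupported());
        byte[] utf8Bytes = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 encoding of U+00E9
        System.out.println("Decoded length: " + decode(utf8Bytes, "UTF-8").length());
    }
}
```

The drawback of the String form is that a typo in the name only surfaces at runtime as an exception, whereas the StandardCharsets constants are checked by the compiler.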

sleske answered Dec 03 '22 06:12