
Charset.defaultCharset() returns different results under JDK 1.7 and JDK 1.6

I am testing my application's i18n compatibility. I have an English version of Windows 7, which means the system's display language is English, and I set the system locale for non-Unicode applications to Chinese.

My application has problems exporting HTML files containing Chinese characters under JDK 1.6, but works fine under JDK 1.7.

I debugged it and found that the direct cause was that Charset.defaultCharset() returned different values.

Under JDK 1.7, Charset.defaultCharset() returned GBK, which is a charset for Chinese.

Under JDK 1.6, Charset.defaultCharset() returned windows-1252, which is a charset for Latin-script languages.

I know the problem can be solved by specifying a charset, say UTF-8, explicitly in code.

But I want to know why Charset.defaultCharset() returns different values under JDK 1.7 and JDK 1.6.
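For reference, the workaround mentioned above (naming the charset explicitly instead of relying on Charset.defaultCharset()) might look like the following minimal sketch. The class name ExplicitCharsetExport, the helper writeHtml, and the file name export.html are made up for illustration and are not from my application. It uses Charset.forName rather than StandardCharsets so that it also compiles on JDK 1.6.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;

class ExplicitCharsetExport {
    // UTF-8 is one of the charsets every Java platform is required to support.
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Hypothetical helper: encodes the HTML as UTF-8 no matter what
    // Charset.defaultCharset() returns on the current JVM.
    static void writeHtml(OutputStream out, String html) throws IOException {
        Writer writer = new OutputStreamWriter(out, UTF8);
        try {
            writer.write(html);
        } finally {
            writer.close(); // also flushes and closes the underlying stream
        }
    }

    public static void main(String[] args) throws IOException {
        // \u4e2d\u6587 is two Chinese characters, written as escapes so the
        // source file's own encoding does not matter.
        String html = "<html><head><meta charset=\"UTF-8\"></head>"
                + "<body>\u4e2d\u6587</body></html>";
        writeHtml(new FileOutputStream("export.html"), html); // hypothetical file name
    }
}
```

With this, the bytes written are the same under both JDKs, although it does not explain why the defaults differ.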

LiuJian asked Nov 18 '11 02:11


1 Answer

Charset.defaultCharset() returns the default charset of the running JVM, so it is not always the same value. For example, if you run your program from NetBeans, it will always return UTF-8, since that is the default encoding for Java projects in NetBeans.

I have a setup similar to yours. My Windows is English (menus and dialogs are in English) and I'm using Turkish for non-Unicode applications. When I start the JVM without any flags or system properties, both the Java 7 and Java 6 runtimes return "CP1254" when Charset.defaultCharset() is called. System.getProperty("file.encoding") and the default I/O encoding are also the same. (The reported system locale differs between these two Java versions, but that's another story.)
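To compare the two runtimes yourself, a small diagnostic along these lines should be enough. This is only a sketch: the class name DefaultCharsetReport and the helper defaultIoEncoding are my own names for illustration. Note that OutputStreamWriter.getEncoding() may report a historical name such as "Cp1254" rather than the canonical charset name.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;
import java.util.Locale;

class DefaultCharsetReport {
    // An OutputStreamWriter created without an explicit charset uses the
    // JVM default, so its encoding name shows the default I/O encoding.
    static String defaultIoEncoding() {
        return new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("java.version     = " + System.getProperty("java.version"));
        System.out.println("defaultCharset   = " + Charset.defaultCharset());
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        System.out.println("default I/O enc. = " + defaultIoEncoding());
        System.out.println("default locale   = " + Locale.getDefault());
    }
}
```

Run it under both JDKs in exactly the same way (same command line or same IDE, no flag such as -Dfile.encoding) and compare the output line by line.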

So I guess your problem is either about how you start your JVM or about how the JVM decides which default encoding to use. If you are sure it is not the former (you run the JVM without any encoding parameter and you do not attempt to change the default charset anywhere in your program), then the JVM is detecting the default encoding incorrectly, and most probably that's abnormal behaviour.

infiniteRefactor answered Sep 23 '22 01:09