Setting the default Java character encoding

Is Java UTF-8 or 16?

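Java supports a wide range of charsets and character encodings. Internally, String and char use UTF-16. The default charset used for encoding and decoding bytes was platform-dependent until JDK 18, which made UTF-8 the default.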

Unfortunately, the file.encoding property has to be specified as the JVM starts up; by the time your main method is entered, the character encoding used by String.getBytes() and the default constructors of InputStreamReader and OutputStreamWriter has been permanently cached.

As Edward Grech points out, in a special case like this, the environment variable JAVA_TOOL_OPTIONS can be used to specify this property, but it's normally done like this:

java -Dfile.encoding=UTF-8 … com.x.Main

Charset.defaultCharset() will reflect changes to the file.encoding property, but most of the code in the core Java libraries that needs to determine the default character encoding does not use this mechanism.

When you are encoding or decoding, you can query the file.encoding property or Charset.defaultCharset() to find the current default encoding, and use the appropriate method or constructor overload to specify it.
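
As a rough sketch of that last point, the class below queries the current default and then passes a charset explicitly to the InputStreamReader constructor overload instead of relying on the no-argument behaviour. The class name DefaultCharsetProbe and the helper decode are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper class, not part of any library.
class DefaultCharsetProbe {
    /** Decodes bytes through the InputStreamReader overload that takes an explicit Charset. */
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset current = Charset.defaultCharset();
        System.out.println("file.encoding = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + current);
        // Encode and decode with the same explicitly passed charset.
        String text = "plain ASCII text";
        System.out.println(decode(text.getBytes(current), current));
    }
}
```

Passing the charset as a parameter keeps the choice visible at the call site, so it is easy to swap the queried default for a fixed charset later.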

From the JVM™ Tool Interface documentation…

Since the command-line cannot always be accessed or modified, for example in embedded VMs or simply VMs launched deep within scripts, a JAVA_TOOL_OPTIONS variable is provided so that agents may be launched in these cases.

By setting the (Windows) environment variable JAVA_TOOL_OPTIONS to -Dfile.encoding=UTF8, the (Java) System property will be set automatically every time a JVM is started. You will know that the parameter has been picked up because the following message will be posted to System.err:

Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
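
To double-check from inside the application that the variable was picked up, something like the following sketch could be run. The class name ToolOptionsCheck and the method describe are invented for this example; it only reads the environment variable and the resulting system property.

```java
// Hypothetical diagnostic class, not part of any library.
class ToolOptionsCheck {
    /** Reports the JAVA_TOOL_OPTIONS variable and the file.encoding the JVM ended up with. */
    static String describe() {
        String toolOptions = System.getenv("JAVA_TOOL_OPTIONS"); // null when the variable is not set
        String fileEncoding = System.getProperty("file.encoding");
        return "JAVA_TOOL_OPTIONS=" + toolOptions + ", file.encoding=" + fileEncoding;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```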

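I have a hacky way that has worked for me, although it relies on JDK internals:

// requires java.lang.reflect.Field and java.nio.charset.Charset
System.setProperty("file.encoding", "UTF-8");
Field charset = Charset.class.getDeclaredField("defaultCharset");
charset.setAccessible(true);
charset.set(null, null); // clear the cached default so it is computed again

This tricks the JVM into thinking that the charset has not been set yet, so it computes it again from file.encoding at runtime and picks up UTF-8. Because it reaches into a private field of Charset through reflection, it is not guaranteed to work on every JVM: JDK 16 and later restrict reflective access to JDK internals by default, so setAccessible may fail there unless the package is opened with --add-opens.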

I think a better approach than setting the platform's default character set, especially as you seem to have restrictions on affecting the application deployment, let alone the platform, is to call the much safer String.getBytes("charsetName"). That way your application is not dependent on things beyond its control.

I personally feel that String.getBytes() should be deprecated, as it has caused serious problems in a number of cases I have seen, where the developer did not account for the default charset possibly changing.

I can't answer your original question, but I would like to offer you some advice: don't depend on the JVM's default encoding. It's always best to explicitly specify the desired encoding (e.g. "UTF-8") in your code. That way, you know it will work even across different systems and JVM configurations.
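
A minimal sketch of that advice, assuming Java 7 or later so that StandardCharsets is available (the class name ExplicitEncoding and its helpers are made up for illustration). Using the Charset overloads also avoids the checked UnsupportedEncodingException that the String-name overloads declare.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class ExplicitEncoding {
    /** Encodes text as UTF-8 no matter what the JVM default is. */
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes bytes that are known to be UTF-8. */
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to keep the source ASCII
        // The same text has different byte lengths under different charsets,
        // which is why an implicit default can silently change the output.
        System.out.println(original.getBytes(StandardCharsets.ISO_8859_1).length); // 4
        System.out.println(toUtf8(original).length);                               // 5
        System.out.println(fromUtf8(toUtf8(original)).equals(original));           // true
    }
}
```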

Try this:

    new OutputStreamWriter(new FileOutputStream("Your_file_fullpath"), Charset.forName("UTF-8"))
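
For context, here is one way that expression might be used in full: a sketch that writes a temporary file as UTF-8 and reads it back. The class ExplicitWriter, its method names, and the temp-file prefix are assumptions for the example; StandardCharsets.UTF_8 is used in place of Charset.forName, which should be equivalent here.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of any library.
class ExplicitWriter {
    /** Writes text to the file as UTF-8, regardless of the JVM default charset. */
    static void writeUtf8(Path file, String text) {
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the whole file back, decoding it as UTF-8. */
    static String readUtf8(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes text to a temporary file and returns what is read back. */
    static String roundTrip(String text) {
        try {
            Path tmp = Files.createTempFile("encoding-demo", ".txt");
            try {
                writeUtf8(tmp, text);
                return readUtf8(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "gr\u00fc\u00df"; // non-ASCII sample, escaped to keep the source ASCII
        System.out.println(roundTrip(text).equals(text)); // true
    }
}
```

The try-with-resources block closes the writer, which also flushes it; forgetting to close an OutputStreamWriter is a common cause of truncated output.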

I have tried a lot of things, but the linked sample code works perfectly.

The crux of the code is:

String s = "एक गाव में एक किसान";
String out = new String(s.getBytes("UTF-8"), "ISO-8859-1");
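
It may help to see what that snippet actually produces. The sketch below (class and method names are invented) applies the same transformation to the first word of the sample string and shows that the result is no longer the original text: it holds one char per UTF-8 byte, and only yields the intended bytes if it is later encoded as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
class ReinterpretDemo {
    /** Same transformation as the snippet above, using Charset constants. */
    static String utf8BytesAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = "\u090f\u0915"; // "एक", the first word of the sample string
        String out = utf8BytesAsLatin1(s);

        System.out.println(out.equals(s)); // false: out is a different string
        System.out.println(out.length());  // 6: one char per UTF-8 byte (2 chars x 3 bytes)

        // Encoding out as ISO-8859-1 gives back exactly the UTF-8 bytes of s.
        byte[] viaLatin1 = out.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(viaLatin1, s.getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

So the trick presumably compensates for an output path that encodes as ISO-8859-1; where the output charset can be set directly, writing with an explicit UTF-8 writer avoids the intermediate garbled string.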


Tags: java, character-encoding, utf-8
