
Java: How to detect (and change?) encoding of System.console?

I have a console program whose umlauts and other special characters are printed as ?s on Macs. Here's a simple test program:

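public class Test {
    public static void main( String[] args ) {
        // Encoded by the PrintStream behind System.out.
        System.out.println("höhößüä");
        // Encoded by the Console; System.console() is null without an interactive terminal.
        System.console().printf( "höhößüä" );
    }
}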

On a default Mac console (with default UTF-8 encoding), this prints:

 h?h????
 h?h????

But after manually setting the Mac terminal's encoding to "Mac OS Roman", it printed correctly:

 höhößüä
 höhößüä

Note that on Windows systems, only the System.console() call prints correctly:

 h÷h÷▀³õ
 höhößüä

So how do I make my program...rolleyes..."run everywhere"?

Epaga asked Mar 10 '10 09:03




2 Answers

Try the following command-line argument when starting your application:

-Dfile.encoding=utf-8

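This sets the JVM's default character encoding, which is used by I/O operations that do not specify one explicitly.

You can also wrap System.out in a PrintStream with an explicit encoding: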

System.setOut(new PrintStream(System.out, true, "utf-8"));
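
To make the effect of the explicit encoding visible, here is a minimal sketch of the same idea. The class name Utf8Out and the helper utf8() are hypothetical names chosen for illustration; only the PrintStream constructor comes from the answer above. The sketch prints the JVM's default charset, counts the bytes UTF-8 produces for the test string, and then replaces System.out:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class Utf8Out {
    // Hypothetical helper: wraps any stream so that text is encoded as UTF-8.
    static PrintStream utf8(OutputStream target) throws UnsupportedEncodingException {
        return new PrintStream(target, true, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Unicode escapes keep the literal independent of the source file's encoding.
        String text = "h\u00f6h\u00f6\u00df\u00fc\u00e4";

        // The charset the JVM falls back to when none is given explicitly.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // Encode into memory first: 2 ASCII letters plus 5 two-byte characters.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        utf8(buffer).print(text);
        System.out.println("UTF-8 byte count: " + buffer.size());

        // From here on, everything written to System.out is encoded as UTF-8.
        System.setOut(utf8(System.out));
        System.out.println(text);
    }
}
```

This only helps if the terminal really expects UTF-8. If the terminal is set to another encoding, the bytes are still UTF-8 and will be displayed incorrectly.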

Bozho answered Oct 18 '22 02:10


Epaga: you can set the output encoding on a PrintStream. You just have to determine, or be absolutely sure about, which encoding the console is using.

import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Test {
    public static void main(String[] argv) throws UnsupportedEncodingException {
        String unicodeMessage =
            "\u7686\u3055\u3093\u3001\u3053\u3093\u306b\u3061\u306f";

        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(unicodeMessage);
    }
}

To determine the console encoding, you could run the system command "locale" and parse its output, which on a German UTF-8 system looks like this:

LANG="de_DE.UTF-8"
LC_COLLATE="de_DE.UTF-8"
LC_CTYPE="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_MONETARY="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_ALL=
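
Parsing that output could look roughly like the sketch below. The class name LocaleCharset and the helper codesetFrom() are hypothetical, and the sketch assumes a POSIX-style system where a "locale" command is on the PATH (it will not work on a stock Windows console). It checks LC_ALL, LC_CTYPE and LANG in that order, which is the POSIX precedence for character handling, and takes whatever follows the dot as the encoding name:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

public class LocaleCharset {
    // Hypothetical helper: extracts the codeset (e.g. "UTF-8") from `locale` output.
    // Returns null if none of the relevant variables carries a codeset.
    static String codesetFrom(String localeOutput) {
        Map<String, String> vars = new HashMap<String, String>();
        for (String line : localeOutput.split("\\r?\\n")) {
            int eq = line.indexOf('=');
            if (eq < 0) continue;
            String value = line.substring(eq + 1).trim().replace("\"", "");
            vars.put(line.substring(0, eq).trim(), value);
        }
        // POSIX precedence for character handling: LC_ALL, then LC_CTYPE, then LANG.
        for (String name : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            String value = vars.get(name);
            if (value == null) continue;
            int dot = value.indexOf('.');
            if (dot < 0) continue; // empty value, or a locale such as "C" without a codeset
            String codeset = value.substring(dot + 1);
            int at = codeset.indexOf('@'); // strip a modifier such as "@euro"
            if (at >= 0) codeset = codeset.substring(0, at);
            if (!codeset.isEmpty()) return codeset;
        }
        return null;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumption: a `locale` executable exists on the PATH.
        Process process = new ProcessBuilder("locale").redirectErrorStream(true).start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append('\n');
            }
        }
        process.waitFor();

        String codeset = codesetFrom(output.toString());
        boolean usable;
        try {
            usable = codeset != null && Charset.isSupported(codeset);
        } catch (IllegalArgumentException e) { // name is not a legal charset name
            usable = false;
        }
        if (usable) {
            System.setOut(new PrintStream(System.out, true, codeset));
        }
        System.out.println("Console codeset from locale: " + codeset);
    }
}
```

On POSIX systems, "locale charmap" prints the codeset on its own, which may be simpler than parsing the full variable list.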

gamma answered Oct 18 '22 00:10